A Big Data Analysis and Data Mining For Used Map Reduce

P. Shanmugapriya; M. Valarmathi

Abstract

Big data mining is the capability of extracting useful information from large datasets or streams of data. Existing schemes attempt to search for patterns of interest in probabilistic databases, and tree-based algorithms are widely used for this purpose. However, these algorithms mine without taking the user's constraints into account. To perform the data mining, we use R programming, which helps to organize the data and to visualize the datasets. Due to the massive amount of data exchanged every day and the increased need for better data-based decisions, businesses are looking for ways to manage and optimize these huge datasets professionally. Moreover, because of globalization, partnerships, value networks, the emergence of social networks, and the huge information flow across and within enterprises, more and more businesses are interested in utilizing big data analytics. Big data analysis and machine learning techniques are essential to reap these benefits. The main focus of this paper is to describe the features of the big data analytics literature, survey the areas that require further research, suggest future research avenues, and present current research findings that could aid practitioners, researchers, and vendors when embarking on big data analytics projects. The experiments are conducted on the Hadoop framework by loading a large number of input files and evaluating the framework's performance with respect to the MapReduce programming paradigm. Data mining delivers insights, patterns, and descriptive and predictive models from the large amounts of data available today in many organizations. The data miner draws heavily on methodologies, techniques, and algorithms from statistics, machine learning, and computer science. R increasingly provides a powerful platform for data mining.