MODIFIED INFO GAIN ATTRIBUTE EVAL FEATURE SELECTION ALGORITHM TO INCREASE EFFICIENCY OF CLASSIFICATION ALGORITHMS IN DATA MINING

  • Kamaldeep Kaur

Abstract

Large quantities of data are now produced and collected daily from various organizational processes. This data accumulates in large databases and data warehouses, and the raw data is then mined for patterns that are useful to the organization. Techniques for obtaining these patterns include classification, clustering and many other data mining methods. The main challenge in obtaining accurate patterns is the sheer volume of data. The stored data is described by attributes, also called features, and these features are often redundant (duplicate values), irrelevant, or noisy (incorrect values). Such features degrade the efficiency of the classification algorithm applied, and results produced from irrelevant features are less accurate. Selecting the relevant features is therefore a necessity, and the unwanted features are removed by the feature selection methods of data mining. Feature selection methods fall into three types: filter, wrapper and hybrid. Each produces a subset of features intended to be optimal, which leads to more accurate classification performance. In this study we implemented an improved version of the filter feature selection algorithm InfoGainAttributeEval by changing its entropy calculation from Shannon entropy to Rényi entropy. We also compared the accuracy of the modified algorithm with a number of hybrid algorithms that use genetic and greedy search techniques, on the J48 and naive Bayes classification algorithms, and found that the modified algorithm gives better results.
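To illustrate the change of entropy measure described in the abstract, the following is a minimal Python sketch, not the author's implementation. The function names, the choice of order alpha = 2, and the use of a weighted average of per-partition entropies as the conditional term are all assumptions made for this example; the abstract does not state them.

```python
import math
from collections import Counter, defaultdict


def shannon_entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def renyi_entropy(labels, alpha=2.0):
    """Renyi entropy (base 2) of order alpha; alpha must not equal 1.

    alpha = 2 is an illustrative default, not a value taken from the paper.
    """
    if alpha == 1:
        raise ValueError("alpha = 1 is the Shannon limit; use shannon_entropy")
    n = len(labels)
    power_sum = sum((c / n) ** alpha for c in Counter(labels).values())
    return math.log2(power_sum) / (1.0 - alpha)


def info_gain(feature_values, labels, entropy_fn=shannon_entropy):
    """Entropy of the class minus the weighted entropy after splitting on a feature.

    Assumption: the conditional term is the weighted average of the entropy
    within each feature value, whichever entropy function is plugged in.
    """
    partitions = defaultdict(list)
    for value, label in zip(feature_values, labels):
        partitions[value].append(label)
    n = len(labels)
    conditional = sum(len(part) / n * entropy_fn(part) for part in partitions.values())
    return entropy_fn(labels) - conditional


if __name__ == "__main__":
    classes = ["yes", "yes", "no", "no"]
    informative = ["a", "a", "b", "b"]   # separates the classes perfectly
    uninformative = ["a", "b", "a", "b"]  # carries no class information
    for name, feature in [("informative", informative), ("uninformative", uninformative)]:
        print(name,
              info_gain(feature, classes, shannon_entropy),
              info_gain(feature, classes, renyi_entropy))
```

A filter method of this kind scores each feature independently and keeps the highest-ranked ones; swapping `entropy_fn` changes only how the scores are computed, not the ranking procedure.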

Published
2019-12-03