Abstract
In data mining and knowledge discovery, the evaluation function used to assess feature quality strongly influences the output of a feature selection algorithm. However, existing entropy-based feature selection algorithms for incomplete data often compute their evaluation functions inadequately, owing to two drawbacks. First, the existing evaluation functions do not take into account differences in the discernibility of features. Second, in forward greedy search, when more than one candidate feature has the same entropy value, the arbitrary choice among them may affect classification performance. This paper introduces a new evaluation function that overcomes these drawbacks. Its main advantage is that the granularity of classification is considered when evaluating candidate features. Based on the new evaluation function, an entropy-based feature selection algorithm for incomplete data is developed. Experimental results show that the proposed evaluation function is more effective than existing evaluation functions in terms of classification accuracy.
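The abstract does not state the evaluation function itself, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes missing values are marked with `None`, uses a tolerance relation (two objects match on a feature if their values are equal or either is missing), one common tolerance-class form of conditional entropy, and a hypothetical tie-break that prefers the candidate with the finer granularity. All function names here are illustrative.

```python
import math

MISSING = None  # assumed marker for a missing value


def tolerant(x, y, features):
    """True if x and y agree on every feature, treating MISSING as a wildcard."""
    return all(x[a] is MISSING or y[a] is MISSING or x[a] == y[a]
               for a in features)


def tolerance_classes(rows, features):
    """For each object, the set of object indices tolerant with it."""
    n = len(rows)
    return [{j for j in range(n) if tolerant(rows[i], rows[j], features)}
            for i in range(n)]


def conditional_entropy(rows, labels, features):
    """Assumed form: -(1/n) * sum_i log2(|T(i) with label of i| / |T(i)|)."""
    total = 0.0
    for i, cls in enumerate(tolerance_classes(rows, features)):
        same = sum(1 for j in cls if labels[j] == labels[i])
        total += math.log2(same / len(cls))
    return -total / len(rows)


def granularity(rows, features):
    """Mean tolerance-class size divided by n; smaller means finer."""
    n = len(rows)
    return sum(len(c) for c in tolerance_classes(rows, features)) / (n * n)


def greedy_select(rows, labels, tol=1e-12):
    """Forward greedy search: add features until the entropy of the
    selected subset matches that of the full feature set."""
    all_features = list(range(len(rows[0])))
    target = conditional_entropy(rows, labels, all_features)
    selected, remaining = [], list(all_features)
    while remaining:
        if conditional_entropy(rows, labels, selected) <= target + tol:
            break

        def score(a):
            # Primary key: entropy; secondary key (hypothetical tie-break):
            # granularity, so ties are not resolved arbitrarily.
            subset = selected + [a]
            return (round(conditional_entropy(rows, labels, subset), 12),
                    granularity(rows, subset))

        best = min(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy incomplete decision table (made-up data for illustration only).
rows = [(0, 0, None), (0, 1, 1), (1, 2, 1), (1, 3, 0)]
labels = [0, 0, 1, 1]
print(greedy_select(rows, labels))
```

On this toy table, features 0 and 1 both reach zero conditional entropy, so entropy alone cannot separate them; the granularity key (0.5 versus 0.25) settles the tie in favour of feature 1. Whether finer granularity is the right preference in general is an assumption of this sketch.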
Original language | English |
---|---|
Pages (from-to) | 98-109 |
Number of pages | 12 |
Journal | Lecture Notes in Computer Science |
Volume | 8444 LNAI |
Issue number | PART 2 |
Publication status | Published - 2014 |
Externally published | Yes |
Event | 18th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, PAKDD 2014 - Tainan, Taiwan, Province of China. Duration: 13 May 2014 → 16 May 2014 |
Keywords
- Conditional entropy
- Evaluation function
- Feature selection
- Incomplete data
- Rough sets