Construct robust rule sets for classification

Jiuyong Li, Rodney Topor, Hong Shen

Research output: Paper (peer-reviewed)

15 Citations (Scopus)

Abstract

We study the problem of computing classification rule sets from relational databases so that accurate predictions can be made on test data with missing attribute values. Traditional classifiers perform badly when test data are not as complete as the training data because they fit the training database too closely. We introduce the concept of one rule set being more robust than another, that is, able to make more accurate predictions on test data with missing attribute values. We show that the optimal class association rule set is as robust as the complete class association rule set. We then introduce the k-optimal rule set, which provides predictions exactly the same as the optimal class association rule set on test data with up to k missing attribute values. This leads to a hierarchy of k-optimal rule sets in which decreasing size corresponds to decreasing robustness, and they are all more robust than a traditional classification rule set. We introduce two methods to find k-optimal rule sets: an optimal association rule mining approach and a heuristic approximate approach. We show experimentally that a k-optimal rule set generated by the optimal association rule mining approach performs better than one generated by the heuristic approximate approach, and that both rule sets perform significantly better than a typical classification rule set (C4.5Rules) on incomplete test data.
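
As an illustration of the robustness idea described in the abstract, a minimal Python sketch follows. The rule representation, the weather-style attributes, the confidence values, and the helper names (`Rule`, `covers`, `predict`, `agree_up_to_k_missing`) are hypothetical and are not taken from the paper. Predicting with the highest-confidence covering rule is an assumed, simplified scheme, not necessarily the authors' method.

```python
from itertools import combinations
from typing import Dict, List, NamedTuple, Optional

Record = Dict[str, Optional[str]]  # None marks a missing attribute value


class Rule(NamedTuple):
    antecedent: Dict[str, str]  # attribute -> required value (hypothetical format)
    label: str                  # class predicted by the rule
    confidence: float           # assumed accuracy estimate from training data


def covers(rule: Rule, record: Record) -> bool:
    """A rule covers a record only if every attribute it tests is present and equal."""
    return all(record.get(attr) == value for attr, value in rule.antecedent.items())


def predict(rules: List[Rule], record: Record, default: str) -> str:
    """Assumed scheme: use the most confident covering rule, else the default class."""
    matching = [r for r in rules if covers(r, record)]
    if not matching:
        return default
    return max(matching, key=lambda r: r.confidence).label


def agree_up_to_k_missing(rules_a: List[Rule], rules_b: List[Rule],
                          record: Record, k: int, default: str) -> bool:
    """Check that two rule sets predict identically when up to k values are hidden."""
    attrs = list(record)
    for n_missing in range(k + 1):
        for hidden in combinations(attrs, n_missing):
            masked = {a: (None if a in hidden else v) for a, v in record.items()}
            if predict(rules_a, masked, default) != predict(rules_b, masked, default):
                return False
    return True


# Hypothetical rule sets: the larger one keeps a second, redundant rule.
small_rules = [Rule({"outlook": "sunny"}, "no", 0.9)]
robust_rules = small_rules + [Rule({"humidity": "high"}, "no", 0.8)]

complete = {"outlook": "sunny", "humidity": "high"}
incomplete = {"outlook": None, "humidity": "high"}

print(predict(small_rules, incomplete, default="yes"))   # falls back to "yes"
print(predict(robust_rules, incomplete, default="yes"))  # still predicts "no"
print(agree_up_to_k_missing(small_rules, robust_rules, complete, 1, "yes"))  # False
```

In this toy setting the two rule sets agree on the complete record but diverge once one attribute value is hidden, which is the sense in which the larger rule set is more robust.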

Original language: English
Pages: 564-569
Number of pages: 6
Publication status: Published - 2002
Externally published
Event: KDD - 2002 Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - Edmonton, Alta, Canada
Duration: 23 Jul 2002 to 26 Jul 2002

Conference

Conference: KDD - 2002 Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Country/Territory: Canada
City: Edmonton, Alta
Period: 23/07/02 to 26/07/02
