TY - JOUR
T1 - Improved Ensemble Classification for Evolving Data Streams
AU - Tian, Hui
AU - Wang, Lulu
AU - Shen, Hong
AU - Liew, Alan Wee Chung
N1 - Publisher Copyright: © 2001-2011 IEEE.
PY - 2022
Y1 - 2022
AB - A major challenge in classifying evolving data streams is feature evolution, in which the features of stream instances change dynamically as the stream progresses. Existing classification methods consider feature evolution either for fixed-size data or only to a limited degree with a presumed dependence on history, so they cannot work effectively on evolving data streams of unbounded size with arbitrary feature evolution. In particular, for evolving data streams whose instances carry multiple labels, classification that copes with feature evolution faces significant challenges. In this article, we present efficient ensemble methods for classifying both single-label and multi-label evolving data streams through effective model coupling. For single-label classification, we present an improved unsupervised classification algorithm that applies multi-cluster feature selection (MCFS), originally proposed for static data classification, within the DXMiner framework to handle each window of instances in a dynamic stream. Our method generates an optimal feature subset and achieves high classification accuracy. We further improve the time complexity of the feature selection process in MCFS by applying the Ball-tree search technique. For multi-label classification, we propose an effective fixed-size ensemble classifier based on multi-label KNN, which by itself works only for static multi-label data; the ensemble incorporates a weight adaptation strategy among its classifiers to dynamically update the model and cope with arbitrary feature evolution as the stream progresses. Extensive experimental results on real-life data streams show that our algorithms outperform existing results for single-label and multi-label classification in both classification accuracy and efficiency.
KW - Data stream classification
KW - ensemble classifier
KW - feature evolution
KW - multi-label classification
UR - https://www.scopus.com/pages/publications/85095983426
U2 - 10.1109/MIS.2020.3033322
DO - 10.1109/MIS.2020.3033322
M3 - Article
AN - SCOPUS:85095983426
SN - 1541-1672
VL - 37
SP - 38
EP - 50
JO - IEEE Intelligent Systems
JF - IEEE Intelligent Systems
IS - 1
ER -