Effective Density-Based Concept Drift Detection for Evolving Data Streams

  • Zelin Cui
  • Hui Tian
  • Hong Shen

Research output: Conference contribution, peer-reviewed

1 Citation (Scopus)

Abstract

Concept drift is a common phenomenon in evolving data streams across a wide range of applications, including credit card fraud protection, weather forecasting, and network monitoring. For online data streams, it is difficult to determine a proper sliding-window size for concept drift detection, which makes existing dataset-distance-based algorithms ineffective in practice. In this paper, we propose a novel framework, Density-based Concept Drift Detection (DCDD), which detects concept drifts in data streams using density-based clustering on a variable-size sliding window whose size is adjusted dynamically. DCDD uses XGBoost (eXtreme Gradient Boosting) to predict the amount of data belonging to the same concept and adjusts the size of the sliding window dynamically based on the collected information about concept drifting. To detect concept drift between two datasets, DCDD calculates the distance between them using a new detection formula that uses time as a weight for old data, and it measures the distance between the data in the current sliding window and all data in the current concept, rather than between two adjacent windows as in the existing work DCDA [2]. This yields an observable improvement in detection accuracy and a significant improvement in detection efficiency. Experimental results show that our framework detects concept drift more accurately and efficiently than the existing work.
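
As a rough illustration of the comparison described in the abstract (the current window measured against all data in the current concept, with older data down-weighted by time), the following Python sketch may help. It is not the paper's detection formula, which the abstract does not state. The exponential decay weighting, the centroid-based Euclidean distance, and the names `time_weighted_centroid`, `drift_distance`, `is_drift`, `decay`, and `threshold` are all assumptions made for illustration. The density-based clustering step and the XGBoost-based window sizing are omitted.

```python
import math

def time_weighted_centroid(points, decay=0.1):
    """Centroid of `points` (ordered oldest first), with older points down-weighted.

    Assumption: exponential decay by age in steps; the paper's weighting may differ.
    """
    n = len(points)
    # The newest point has age 0 and weight 1; older points get exp(-decay * age).
    weights = [math.exp(-decay * (n - 1 - i)) for i in range(n)]
    total = sum(weights)
    dims = len(points[0])
    return [sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(dims)]

def centroid(points):
    """Plain (unweighted) centroid of the points in the current window."""
    dims = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dims)]

def drift_distance(concept_points, window_points, decay=0.1):
    """Euclidean distance between the time-weighted concept centroid
    and the centroid of the current sliding window (hypothetical measure)."""
    return math.dist(time_weighted_centroid(concept_points, decay),
                     centroid(window_points))

def is_drift(concept_points, window_points, threshold=1.0, decay=0.1):
    """Flag a drift when the distance exceeds a hypothetical fixed threshold."""
    return drift_distance(concept_points, window_points, decay) > threshold

if __name__ == "__main__":
    concept = [(0.0, 0.0)] * 5   # all data seen so far in the current concept
    window = [(3.0, 4.0)] * 3    # data in the current sliding window
    print(drift_distance(concept, window), is_drift(concept, window))
```

Comparing against the whole concept rather than only the previous window means a slow, gradual shift accumulates distance instead of being hidden by small window-to-window differences, which is the motivation the abstract gives for departing from adjacent-window comparison.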

Original language: English
Title of host publication: Parallel and Distributed Computing, Applications and Technologies - Proceedings of PDCAT 2023
Editors: Ji Su Park, Hiroyuki Takizawa, Hong Shen, James J. Park
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 190-201
Number of pages: 12
ISBN (Print): 9789819982103
Publication status: Published - 2024
Event: 24th International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT 2023 - Jeju, Korea, Republic of
Duration: 16 Aug 2023 to 18 Aug 2023

Publication series

Name: Lecture Notes in Electrical Engineering
Volume: 1112 LNEE
ISSN (Print): 1876-1100
ISSN (Electronic): 1876-1119

Conference

Conference: 24th International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT 2023
Country/Territory: Korea, Republic of
City: Jeju
Period: 16/08/23 to 18/08/23