Clustering Algorithms based Noise Identification from Air Pollution Monitoring Data

Xinyi Fang, Chak Fong Chong, Xu Yang, Yapeng Wang

Research output: Conference contribution (peer-reviewed)

1 citation (Scopus)

Abstract

Noise detection has been widely discussed as data science has developed, yet no single method is universally best. In this paper, we propose a clustering-based solution to identify and remove noise from air pollution data collected with mobile portable sensors. The test dataset consists of air pollution data collected by portable sensors over three seasons on campus in Macao. To find the most appropriate algorithm for this task, we applied and compared six clustering algorithms: Simple K-means, Hierarchical Clustering, Cascading K-means, X-means, Expectation Maximization, and Self-Organizing Map. Performance is evaluated by accuracy and by the best number of clusters as calculated with the Silhouette Coefficient. Additionally, the J48 decision tree, a classification algorithm, can extract the key attributes and identify the noise cluster for future unlabeled data that may contain noise. The experimental results indicate that Expectation Maximization and Cascading K-means perform best. Moreover, temperature and carbon dioxide are vital attributes in identifying the noise cluster.
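
As an illustration of the Silhouette-based choice of cluster count described in the abstract, here is a minimal Python sketch. It is not the authors' implementation: the data are synthetic points standing in for standardized (temperature, CO2) readings, the helper names (`farthest_first`, `kmeans`, `silhouette`, `blob`) are hypothetical, plain k-means with deterministic farthest-first seeding stands in for the six algorithms compared, and treating the smallest cluster as the noise candidate is an assumption made only for this example.

```python
import math
from collections import Counter


def farthest_first(points, k):
    """Deterministic seeding: start at points[0], then repeatedly add the
    point that is farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    centers = farthest_first(points, k)
    labels = []
    for _ in range(iters):
        # Assign every point to its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        # Move each center to the mean of its members (unchanged if empty).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean Silhouette Coefficient over all points (between -1 and 1)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("the Silhouette Coefficient needs at least two clusters")
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the nearest other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != lab)
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


def blob(cx, cy, offsets):
    """Synthetic group of points around (cx, cy)."""
    return [(cx + dx, cy + dy) for dx, dy in offsets]


# Synthetic stand-in data: two large groups plus one small, distant group.
offsets = [(dx * 0.2, dy * 0.2) for dx in (-1, 0, 1) for dy in (-1, 0, 1)]
points = blob(0, 0, offsets) + blob(5, 0, offsets) + blob(2.5, 6, offsets[:4])

# Score each candidate cluster count and keep the one with the best silhouette.
scores = {k: silhouette(points, kmeans(points, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
labels = kmeans(points, best_k)

# Assumption for this sketch only: the smallest cluster is the noise candidate.
sizes = Counter(labels)
noise_label = min(sizes, key=sizes.get)
print("best k:", best_k, "cluster sizes:", dict(sizes), "noise candidate:", noise_label)
```

On real sensor data the attributes would need to be scaled before computing distances, and the cluster labels could then be passed to a decision-tree learner to see which attributes separate the noise cluster, as the abstract describes for J48.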

Original language: English
Title of host publication: Proceedings of IEEE Asia-Pacific Conference on Computer Science and Data Engineering, CSDE 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781665453059
Publication status: Published - 2022
Event: 2022 IEEE Asia-Pacific Conference on Computer Science and Data Engineering, CSDE 2022 - Gold Coast, Australia
Duration: 18 Dec 2022 to 20 Dec 2022

Publication series

Name: Proceedings of IEEE Asia-Pacific Conference on Computer Science and Data Engineering, CSDE 2022

Conference

Conference: 2022 IEEE Asia-Pacific Conference on Computer Science and Data Engineering, CSDE 2022
Country/Territory: Australia
City: Gold Coast
Period: 18/12/22 to 20/12/22
