ORIGINAL RESEARCH
Applying Machine-Learning Methods Based on
Causality Analysis to Determine Air
Quality in China
Communication University of Zhejiang, Hangzhou, China
Submission date: 2018-08-29
Final revision date: 2018-10-27
Acceptance date: 2018-11-07
Online publication date: 2019-05-29
Publication date: 2019-07-08
Corresponding author
Bocheng Wang
Communication University of Zhejiang, No. 998 Xueyuan Street, Hangzhou 310018, Zhejiang Province, China
Pol. J. Environ. Stud. 2019;28(5):3877-3885
ABSTRACT
A novel method is proposed for classifying air quality in China. Significance tests based on
causality analysis were combined with several machine-learning algorithms to achieve automated
and accurate classification. The 100 most developed cities in China were selected as study areas.
Using time series drawn from a large volume of air monitoring data, we analyzed meteorological
factors (temperature, humidity, precipitation, wind speed, air pressure, sunshine duration,
evaporation and ground surface temperature) and the individual industrial pollutants NO2, SO2,
CO and O3, focusing on the causal influence of each pollutant's accumulation on PM2.5. To better
clarify the formation of haze, joint regression models were established to quantify the degree to
which each factor contributes to PM2.5. Several classification models, including KNN, SVM,
ensemble and decision tree, were trained and tested to predict air quality. The ensemble (boosted
trees) classifier achieved an accuracy of 90.2%. The results of both feature selection and
classification indicated that NO2 played an important role in PM2.5 concentrations in China
during 2015-2017.
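
The idea of a causality-based significance test on pollutant time series can be illustrated with a minimal sketch. The abstract does not name the specific test, so the example below assumes a Granger-style lag-1 F-test, and it runs on synthetic series rather than the study's monitoring data. The function names (`solve`, `rss`, `lagged_f_statistic`) and the simulated coefficients are hypothetical choices made for illustration only.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def rss(rows, y):
    """Residual sum of squares of an ordinary least-squares fit of y on rows."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, t in zip(rows, y))


def lagged_f_statistic(x, y):
    """F-statistic testing whether x[t-1] improves a lag-1 autoregression of y[t]."""
    target = y[1:]
    restricted = [[1.0, y[t - 1]] for t in range(1, len(y))]
    unrestricted = [[1.0, y[t - 1], x[t - 1]] for t in range(1, len(y))]
    rss_r = rss(restricted, target)
    rss_u = rss(unrestricted, target)
    dof = len(target) - 3  # observations minus unrestricted parameters
    return (rss_r - rss_u) / (rss_u / dof)  # one restriction, so numerator df = 1


# Synthetic data (an assumption, not the paper's data): PM2.5 depends on
# its own previous value and on the previous NO2 value.
rng = random.Random(0)
no2 = [rng.gauss(0, 1) for _ in range(300)]
pm25 = [0.0]
for t in range(1, 300):
    pm25.append(0.5 * pm25[t - 1] + 0.8 * no2[t - 1] + rng.gauss(0, 1))

f_forward = lagged_f_statistic(no2, pm25)  # does NO2 help predict PM2.5?
f_reverse = lagged_f_statistic(pm25, no2)  # does PM2.5 help predict NO2?
print(f"NO2 -> PM2.5: F = {f_forward:.1f}")
print(f"PM2.5 -> NO2: F = {f_reverse:.1f}")
```

In this simulated setting the forward statistic is far larger than the reverse one, which is the asymmetry such a test looks for. In a workflow like the one the abstract describes, factors that pass the significance test would then serve as input features for the classifiers.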