Study of a PM2.5 concentration prediction model based on improved machine learning

Authors: DING Chengliang, ZHENG Hongbo[1] (School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China)

Affiliation: [1] School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, Liaoning, China

Source: Journal of Dalian University of Technology, 2024, No. 4, pp. 353-360 (8 pages)

Funding: National Natural Science Foundation of China (42071273); Fundamental Research Funds for the Central Universities (DUT22LAB132).

Abstract: Existing machine learning models for predicting PM2.5 concentration lose performance because they are overly complex, ignore spatio-temporal information, and impute missing values inaccurately. To address these problems, random forest is used instead of statistical methods to fill in missing values, and spatio-temporal factors are incorporated to improve model accuracy. A PM2.5 concentration prediction model suited to coastal cities (the K-means-RF-XGBoost model) is established by combining remote sensing data, meteorological data, and co-pollutant data; its prediction time is only 4% of that of a BP neural network. The model is trained and tested on real-time monitoring data from Dalian in 2019. The results show that the K-means-RF-XGBoost model predicts PM2.5 concentration with high accuracy: compared with the same type of model without spatio-temporal information, the root mean square error (e_rms) decreases by about 48% and the coefficient of determination (R^2) increases by about 10%. The model effectively predicts high PM2.5 concentrations and handles wide fluctuation ranges; for example, the spring model reaches an R^2 of 0.935 on the test set. It also performs well in daily-level prediction, with an R^2 of up to 0.819. This study offers a new approach to PM2.5 concentration prediction in coastal cities.
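The abstract reports its results as changes in root mean square error and coefficient of determination. As a minimal sketch of how those two metrics are conventionally computed, the Python below uses the standard definitions and assumes the reported percentages are relative changes against the baseline model; the abstract does not state this explicitly. The function names and the four sample values are hypothetical and are not data from the paper.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between paired observations and predictions."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed, predicted):
    """Coefficient of determination, computed as 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


def relative_change(baseline, improved):
    """Signed relative change of a metric with respect to the baseline."""
    return (improved - baseline) / baseline


# Hypothetical PM2.5 values, made up for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
baseline_pred = [14.0, 16.0, 34.0, 36.0]  # stand-in for a model without spatio-temporal inputs
improved_pred = [12.0, 18.0, 32.0, 38.0]  # stand-in for a model with them

rmse_change = relative_change(rmse(observed, baseline_pred),
                              rmse(observed, improved_pred))
r2_change = relative_change(r_squared(observed, baseline_pred),
                            r_squared(observed, improved_pred))
print(f"RMSE change: {rmse_change:+.1%}, R^2 change: {r2_change:+.1%}")
```

Under this reading, a negative RMSE change and a positive R^2 change both indicate improvement, which matches the direction of the figures quoted in the abstract.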

Keywords: PM2.5 concentration prediction; spatio-temporal information; missing value imputation; machine learning

Classification number: X513 [Environmental Science and Engineering: Environmental Engineering]

 
