Predictive Analysis of PM2.5 in Nanjing under Multiple Machine Learning Models


Author: JU Yang (鞠杨), Honor College of Nanjing Normal University, Nanjing, Jiangsu 210046, China

Affiliation: [1] Honor College of Nanjing Normal University, Nanjing, Jiangsu 210046, China

Source: Environmental Science Survey (《环境科学导刊》), 2025, Issue 2, pp. 46-52 (7 pages)

Abstract: Five machine learning models were applied to the prediction of PM2.5 concentration in Nanjing: multiple linear regression, random forest, K-nearest neighbor (KNN), BP neural network (BPNN), and eXtreme Gradient Boosting (XGBoost). The study used air quality and meteorological data for Nanjing from 2021 and 2022; after data preprocessing and feature scaling, the models were trained and tested. The evaluation metrics were the coefficient of determination (R²), root mean squared error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE). The results show that all five models predicted well overall, with the random forest model achieving the highest accuracy and the smallest error. A seasonal analysis shows that multiple linear regression and BPNN were more accurate in spring and winter than in summer and autumn, while random forest, KNN, and XGBoost were most accurate in winter. In terms of running efficiency, BPNN had the longest training time and the largest memory usage, while KNN had the shortest running time and the smallest memory usage. Considering both prediction accuracy and running efficiency, the random forest model performed best for PM2.5 concentration prediction in Nanjing. The methods and models in this study may also serve as a reference for air quality prediction in other regions.
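The four evaluation metrics named in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the author's code. It assumes that R2 denotes the coefficient of determination and that MAPE is reported in percent. The function name `regression_metrics` and the sample concentrations are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, MAE and MAPE (percent) for paired observations.

    Assumptions: y_true contains no zeros (MAPE is undefined there) and
    is not constant (R2 is undefined when the total variance is zero).
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    mean_true = sum(y_true) / n
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": math.sqrt(ss_res / n),
        "MAE": sum(abs(e) for e in errors) / n,
        "MAPE": 100.0 * sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n,
    }


# Made-up PM2.5 concentrations, for illustration only (not data from the study).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

A higher R2 and lower RMSE, MAE, and MAPE indicate a better fit, which is the sense in which the abstract ranks the five models.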

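Keywords: meteorological factors; PM2.5 prediction; machine learning; multiple linear regression model; random forest model; K-nearest neighbor model (KNN); BP neural network model (BPNN); eXtreme Gradient Boosting (XGBoost)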

Classification number: X51 [Environmental Science and Engineering: Environmental Engineering]

 
