Authors: WANG Zhi; ZHANG Zhi-qiang [1,2]; XIE Xiao-qin [1,2]; PAN Hai-wei [1]
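Affiliations: [1] College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, Heilongjiang, China; [2] Research Center for Intelligent Information Processing, Harbin Engineering University, Harbin 150001, Heilongjiang, China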
Source: Software (《软件》), 2018, No. 10, pp. 156-163 (8 pages)
Abstract: Current air quality forecasts predict PM2.5 concentration inaccurately. To address this, the paper predicts PM2.5 concentration with a boosting tree model, uses feature importance to improve the efficiency of the algorithm, and analyzes how different features affect prediction accuracy. First, six meteorological factors, including temperature and wind speed, are extracted from data recorded by several meteorological observation stations in the Beijing area from January to December 2016. These are combined with concentration data for six air pollutants, collected over the same period at twelve national control sites in Beijing, to form the feature vector. Next, a boosting tree is used to predict PM2.5 concentration over the next 24 hours, and the results are compared with linear regression (LR). Finally, the prediction model is improved by extracting feature importance information, and the features with the greatest influence on PM2.5 concentration are analyzed. The predictions are evaluated with K-fold cross-validation. The experimental results show that, compared with the linear regression model, the proposed boosting tree model is 10% to 30% more accurate in predicting PM2.5 concentration over the next 24 hours, and that the improved algorithm is 20% more efficient.
Keywords: machine learning; air pollution; PM2.5 concentration prediction; boosting tree; XGBoost
Classification number: TP391.4 [Automation and Computer Technology: Computer Application Technology]
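
The workflow summarized in the abstract (fit a boosting tree, score it with K-fold cross-validation, then read off feature importance) can be illustrated with a minimal sketch. This is not the authors' implementation: the keywords name XGBoost, whereas the code below is a plain gradient-boosting regressor built from depth-1 trees (stumps) using only the Python standard library. The data are synthetic, and every name and parameter here (`fit_stump`, `fit_boosting`, `k_fold_rmse`, the three features, the learning rate, the number of rounds) is an assumption made for illustration.

```python
import random

def fit_stump(X, resid):
    """Best single-feature threshold split on the residuals (squared loss)."""
    n, d = len(X), len(X[0])
    total = sum(resid)
    best = None  # (sse_reduction, feature, threshold, left_mean, right_mean)
    for j in range(d):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += resid[order[k - 1]]
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if lo == hi:
                continue  # cannot split between equal values
            right_sum = total - left_sum
            # Reduction in squared error obtained by this split.
            gain = left_sum**2 / k + right_sum**2 / (n - k) - total**2 / n
            if best is None or gain > best[0]:
                best = (gain, j, (lo + hi) / 2, left_sum / k, right_sum / (n - k))
    return best

def fit_boosting(X, y, n_rounds=100, lr=0.1):
    """Gradient boosting for squared loss: each stump fits current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps, importance = [], [0.0] * len(X[0])
    for _ in range(n_rounds):
        resid = [yi - pi for yi, pi in zip(y, pred)]
        best = fit_stump(X, resid)
        if best is None:
            break
        gain, j, thr, left, right = best
        importance[j] += gain  # gain-based importance, summed per feature
        stumps.append((j, thr, lr * left, lr * right))
        pred = [p + (lr * left if x[j] <= thr else lr * right)
                for p, x in zip(pred, X)]
    return base, stumps, importance

def predict(model, x):
    base, stumps, _ = model
    return base + sum(l if x[j] <= thr else r for j, thr, l, r in stumps)

def k_fold_rmse(X, y, k=5, seed=0):
    """Mean RMSE over k held-out folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    scores = []
    for fold in (idx[i::k] for i in range(k)):
        held = set(fold)
        train = [i for i in idx if i not in held]
        model = fit_boosting([X[i] for i in train], [y[i] for i in train])
        mse = sum((predict(model, X[i]) - y[i]) ** 2 for i in fold) / len(fold)
        scores.append(mse ** 0.5)
    return sum(scores) / k

# Synthetic stand-in data (NOT the Beijing measurements): feature 0 drives the
# target strongly, feature 1 weakly, feature 2 is pure noise.
rng = random.Random(42)
X = [[rng.random(), rng.random(), rng.random()] for _ in range(200)]
y = [10.0 * (x[0] > 0.5) + 2.0 * x[1] + rng.gauss(0, 0.5) for x in X]

cv_rmse = k_fold_rmse(X, y)
mean_y = sum(y) / len(y)
baseline_rmse = (sum((v - mean_y) ** 2 for v in y) / len(y)) ** 0.5
_, _, importance = fit_boosting(X, y)
print("CV RMSE:", cv_rmse, "mean-predictor RMSE:", baseline_rmse)
print("gain-based importance per feature:", importance)
```

The baseline in this sketch is a constant mean predictor rather than the linear regression used in the paper, so the printed numbers say nothing about the reported 10% to 30% improvement. Dropping features with low summed gain before refitting is one way importance information could reduce training cost, though the abstract does not state how the authors did this.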