Method for Abnormal Data Detection and Missing Data Filling in Water Resources Consumption Forecasting  Cited by: 8


Authors: Zhang Feng; Song Xiaona[3]; Xue Huifeng[2]; Wang Haining[2]

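Affiliations: [1] School of Management, Shandong University of Technology, Zibo, Shandong 255012, China; [2] China Academy of Aerospace System Science and Engineering, Beijing 100048, China; [3] School of Business, Taishan University, Taian, Shandong 271000, China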

Source: Statistics & Decision (《统计与决策》), 2018, No. 16, pp. 13-17 (5 pages)

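Funding: National Natural Science Foundation of China (71371112); Key Program of the National Natural Science Foundation of China (U1501253); Guangdong Provincial Science and Technology Plan Project (2016B010127005); Shandong Provincial Natural Science Foundation (ZR2012GM020); Centrally Allocated Water Resources Fee Project (2016H22SK041)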

Abstract: Reliable and complete historical time-series data on water resources consumption are a basic prerequisite for accurate forecasting. Drawing on existing methods for outlier detection and missing-value handling, this paper uses partial least squares (PLS) to extract principal components from water resources consumption data and related socio-economic indicators, and plots the Q2 ellipse diagram of their cumulative contribution to identify outliers. It then applies least residual regression to forecast time series that contain genuine abrupt changes, and builds a least squares support vector machine (LS-SVM) model optimized by particle swarm optimization (PSO) to fill in missing data. The results show that computing the cumulative contribution of the principal components by PLS and plotting the Q2 ellipse diagram detects outliers by exploiting the stretching effect that outliers exert on the data as a whole; that least residual regression forecasts water resources consumption series containing abrupt changes more accurately than traditional least squares regression; and that the PSO-optimized LS-SVM further improves the fit and fills in missing water resources consumption data reasonably.

Keywords: water resources consumption; outliers; missing values; data detection; data filling

Classification code: TV213.4 [Hydraulic Engineering: Hydrology and Water Resources]
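
The abstract describes detecting outliers from an ellipse drawn over PLS components but gives no formulas, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the ellipse is a Hotelling T²-style confidence ellipse in the plane of the first two PLS score vectors, uses one common convention for the T² limit, and relies on the closed-form F quantile that exists when the numerator has 2 degrees of freedom. The function names (`standardize`, `pls1_scores`, `t2_outliers`), the indicator series, and the injected error at index 7 are all hypothetical.

```python
import math

def standardize(col):
    """Center a series and scale it to unit sample variance."""
    n = len(col)
    mean = sum(col) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
    return [(v - mean) / sd for v in col]

def pls1_scores(x_cols, y, n_comp=2):
    """First n_comp PLS1 score vectors via NIPALS (single response y)."""
    X = [standardize(c) for c in x_cols]   # indicator columns
    r = standardize(y)                     # response, deflated each round
    n = len(r)
    scores = []
    for _ in range(n_comp):
        w = [sum(c[i] * r[i] for i in range(n)) for c in X]   # weights, proportional to X^T r
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
        t = [sum(w[j] * X[j][i] for j in range(len(X))) for i in range(n)]
        tt = sum(v * v for v in t)
        p = [sum(c[i] * t[i] for i in range(n)) / tt for c in X]   # X loadings
        q = sum(r[i] * t[i] for i in range(n)) / tt                # y loading
        X = [[c[i] - t[i] * p[j] for i in range(n)] for j, c in enumerate(X)]
        r = [r[i] - q * t[i] for i in range(n)]
        scores.append(t)
    return scores

def t2_outliers(scores, alpha=0.05):
    """Flag observations outside the assumed T^2 ellipse of two score vectors."""
    assert len(scores) == 2, "closed-form F quantile below needs exactly 2 components"
    n = len(scores[0])
    var = [sum(v * v for v in t) / (n - 1) for t in scores]   # scores have zero mean
    t2 = [sum(scores[h][i] ** 2 / var[h] for h in range(2)) for i in range(n)]
    d2 = n - 2
    f_crit = (d2 / 2) * (alpha ** (-2 / d2) - 1)       # upper-alpha quantile of F(2, n-2)
    limit = 2 * (n * n - 1) / (n * (n - 2)) * f_crit   # assumed convention for the limit
    flagged = [i for i in range(n) if t2[i] > limit]
    return flagged, t2, limit

# Hypothetical 20-year series; values are illustrative, not from the paper.
years = list(range(20))
gdp = [50.0 + 4.0 * i + 0.3 * ((i * 7) % 5 - 2) for i in years]
population = [5.0 + 0.3 * i + 0.05 * ((i * 3) % 4) for i in years]
population[7] += 30.0   # injected gross recording error
water = [100.0 + 2.0 * i + 0.5 * ((i * 5) % 3 - 1) for i in years]

scores = pls1_scores([gdp, population], water)
flagged, t2, limit = t2_outliers(scores)
print("T^2 limit:", round(limit, 3), "flagged indices:", flagged)
```

With only two hypothetical indicators, the two components span the whole indicator space, so T² here reduces to a Mahalanobis distance; a real application would retain fewer components than indicators. Note also that PLS scores are built from the indicator block, so the ellipse reacts to anomalous indicator rows rather than to an error confined to the response series.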

 
