Structural Prediction of Multivariate Time Series Through Outlier Elimination  Cited by: 11

Authors: MAO Wen-Tao [1,2,3]; JIANG Meng-Xue; LI Yuan; ZHANG Shi-Guang [1,2]

Affiliations: [1] College of Computer and Information Engineering, Henan Normal University, Xinxiang 453007; [2] Computational Intelligence and Data Mining Engineering Technology Research Center of Colleges and Universities of Henan Province, Xinxiang 453007; [3] School of Mechanics and Civil & Architecture, Northwestern Polytechnical University, Xi'an 710129

Source: Acta Automatica Sinica (《自动化学报》), 2018, No. 4, pp. 619-634 (16 pages)

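Funding: Supported by the National Natural Science Foundation of China (U1204609); the China Postdoctoral Science Foundation (2016T90944); the Program for Science and Technology Innovation Talents in Universities of Henan Province (15HASTIT022); the Funding Scheme for Young Key Teachers of Universities in Henan Province (2014GGJS-046); the Henan Normal University Science Foundation for Excellent Young Scholars (14YQ007); and the Key Scientific Research Project of Higher Education Institutions of Henan Province (16A520015).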

Abstract: Traditional multivariate time series prediction methods generally ignore the dependencies among variables, which degrades prediction performance. To address this problem, a multivariate time series structural prediction algorithm based on outlier sequence elimination is proposed. The algorithm exploits the inherent structured-output property of multi-dimensional support vector regression (M-SVR) to jointly predict multiple variable sequences that have been selected as similar. First, the known sequences are grouped by hierarchical clustering based on fuzzy entropy, which gives an initial partition of similar sequences. Second, the principal curve of all sequences in a cluster is computed, an outlier factor (abnormality degree) is calculated for each sequence from its distance to the principal curve, and outlier sequences are then removed from the clustering result. Finally, the selected similar sequences are used as input to M-SVR, which builds the prediction model and performs the structural prediction. A theoretical analysis shows that the proposed algorithm has an upper bound on information loss and a lower bound on reliability, which supports its reasonableness and feasibility from the perspective of information entropy. Experiments are conducted on three chaotic time series datasets and five real-life datasets. The results show that, compared with several representative existing methods, the proposed algorithm effectively recovers the inner group structure of multivariate time series and achieves higher prediction accuracy and better numerical stability under two different error measures.
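The first two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. `fuzzy_entropy` uses one common FuzzyEn variant (mean-removed templates, Chebyshev distance, membership exp(-(d/r)^n)), and the parameter defaults are assumptions. `outlier_scores` is a hypothetical score that uses the pointwise median of a cluster as a simple stand-in for the principal curve, since the abstract does not give the exact outlier factor. The hierarchical clustering and M-SVR steps are not shown.

```python
import math
import statistics


def fuzzy_entropy(series, m=2, r_factor=0.2, n=2):
    """Fuzzy entropy (FuzzyEn) of a 1-D series.

    Assumed variant: mean-removed templates, Chebyshev distance, and the
    exponential membership exp(-(d / r) ** n) with r = r_factor * std.
    """
    r = r_factor * statistics.pstdev(series)
    if r == 0:
        raise ValueError("series is constant; tolerance r would be zero")
    count = len(series) - m  # same number of templates for m and m + 1

    def phi(dim):
        # Build templates of length `dim` with their own mean removed.
        templates = []
        for i in range(count):
            window = series[i:i + dim]
            base = sum(window) / dim
            templates.append([x - base for x in window])
        # Average fuzzy similarity over all ordered pairs (i != j).
        total = 0.0
        for i in range(count):
            for j in range(count):
                if i != j:
                    d = max(abs(a - b)
                            for a, b in zip(templates[i], templates[j]))
                    total += math.exp(-((d / r) ** n))
        return total / (count * (count - 1))

    return math.log(phi(m)) - math.log(phi(m + 1))


def outlier_scores(cluster):
    """Hypothetical outlier score for equal-length series in one cluster.

    Each score is the Euclidean distance to the pointwise median curve
    (a stand-in for the principal curve), divided by the median of those
    distances, so typical members score near 1.
    """
    center = [statistics.median(column) for column in zip(*cluster)]
    dists = [math.dist(series, center) for series in cluster]
    scale = statistics.median(dists) or 1.0  # avoid division by zero
    return [d / scale for d in dists]
```

In such a sketch, series with similar fuzzy entropy values would be clustered together first, and any series whose score exceeds a chosen threshold would be dropped from its cluster before the remaining series are passed jointly to the regression model. The threshold is a tuning choice that the abstract does not specify.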

Keywords: time series clustering; principal curve; outlier sequence; multi-dimensional support vector regression (M-SVR)

CLC number: O211.61 [Science: Probability Theory and Mathematical Statistics]

 
