Analysis on Large-scale Time Series Data Compression Based on Extreme Point Feature Recognition (基于极值点特征识别的大规模时序数据压缩分析)  Cited by: 6


Authors: Lu Minrong (卢民荣) [1,2]; Zheng Jianning (郑建宁)

Affiliations: [1] School of Accounting, Fujian Jiangxia University, Fuzhou 350108, China; [2] Finance and Accounting Research Center, Fujian Jiangxia University, Fuzhou 350108, China; [3] Fujian Yili Electric Power Technology Co., Ltd., Fuzhou 350003, China

Source: Statistics & Decision (《统计与决策》), 2021, No. 20, pp. 39-43 (5 pages)

Funding: Major Project of the Fujian Provincial Social Science Fund (FJ2019JDZ053, FJ2020JDZ068, FJ2020JDZ070); Research Project Funded by Fujian Provincial Finance (2021-11).

Abstract: In the context of big data, time series data volumes are growing sharply across multiple dimensions. Extreme points play an important role in data analysis and prediction, for example by easing the load of big data analysis, but traditional extreme point extraction methods tend to produce incomplete or incorrect extreme points. To address this, after normalizing the time series data, the paper improves the extreme point extraction algorithm by treating three cases according to the particular properties of extreme points: crossover intervals, obvious trends, and non-obvious trends. It analyzes the advantages of the improved equal-length-interval extraction algorithm and compares, through experiments, the extraction results of the optimized trend-adaptive processing. The results show that the algorithm adapts to different types of time series data, including series with obvious trends, series with non-obvious trends, and series with sudden local changes. Experiments on stock market indices, exchange rates, and GDP data sets indicate that the algorithm has a degree of general applicability. By building both the full extreme point sequence and a threshold-compressed extreme point sequence, and by using the extreme point compression rate and loss rate, the experiments enhance the scalability and extensibility of the algorithm, so that it can further adapt to the processing and research needs of time series data of different types.
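The pipeline named in the abstract (normalization, extreme point extraction, threshold compression, compression rate, loss rate) can be illustrated with a minimal sketch. This is not the authors' improved algorithm: it uses the plain sign-change test for local extrema, which is the kind of traditional method the abstract criticizes (it misses flat plateaus, for instance). All function names are hypothetical, and the definitions of compression rate (share of points discarded) and loss rate (mean absolute error after linear interpolation) are assumptions, since the abstract does not define them.

```python
from typing import List


def normalize(series: List[float]) -> List[float]:
    """Min-max scale to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0 for _ in series]
    return [(x - lo) / (hi - lo) for x in series]


def extreme_points(series: List[float]) -> List[int]:
    """Indices of both endpoints plus every strict local maximum/minimum."""
    n = len(series)
    if n <= 2:
        return list(range(n))
    idx = [0]
    for i in range(1, n - 1):
        left = series[i] - series[i - 1]
        right = series[i + 1] - series[i]
        if left * right < 0:  # slope changes sign at i
            idx.append(i)
    idx.append(n - 1)
    return idx


def compress(series: List[float], idx: List[int], threshold: float) -> List[int]:
    """Keep an interior extreme point only if it differs from the last kept
    point by at least `threshold` (assumed rule); endpoints are always kept."""
    if not idx:
        return []
    kept = [idx[0]]
    for i in idx[1:-1]:
        if abs(series[i] - series[kept[-1]]) >= threshold:
            kept.append(i)
    if idx[-1] != kept[-1]:
        kept.append(idx[-1])
    return kept


def reconstruct(series: List[float], kept: List[int]) -> List[float]:
    """Rebuild the series by linear interpolation between kept points."""
    out = [0.0] * len(series)
    if len(kept) == 1:
        out[kept[0]] = series[kept[0]]
    for a, b in zip(kept, kept[1:]):
        for i in range(a, b + 1):
            t = (i - a) / (b - a)
            out[i] = series[a] + t * (series[b] - series[a])
    return out


def compression_rate(n_points: int, kept: List[int]) -> float:
    """Assumed definition: fraction of the original points discarded."""
    return 1.0 - len(kept) / n_points


def loss_rate(series: List[float], kept: List[int]) -> float:
    """Assumed definition: mean absolute reconstruction error."""
    rebuilt = reconstruct(series, kept)
    return sum(abs(x - y) for x, y in zip(series, rebuilt)) / len(series)


if __name__ == "__main__":
    data = normalize([1.0, 3.0, 2.0, 2.1, 2.0, 5.0, 4.0])
    all_extremes = extreme_points(data)
    kept = compress(data, all_extremes, threshold=0.1)
    print(kept, compression_rate(len(data), kept), loss_rate(data, kept))
```

Raising `threshold` discards more small oscillations, which increases the compression rate at the cost of a higher loss rate; the paper's trend-adaptive handling of crossover intervals and weak trends is not reproduced here.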

Keywords: time series; extreme points; data compression; preprocessing

Classification number: TB115 [Science: Mathematics]

 
