Time-Series Semantic Mining Algorithm Based on Sub-Series Similarity (基于子序列相似性的时间序列语义挖掘算法)  Cited by: 3


Authors: LU Yi (陆怡) [1], WANG Peng (王鹏) [2], WANG Wei (汪卫) [2]

Affiliations: [1] School of Software, Fudan University, Shanghai 201203, China; [2] School of Computer Science, Fudan University, Shanghai 201203, China

Source: Computer Engineering (《计算机工程》), 2022, No. 10, pp. 88-94 (7 pages)

Funding: National Key Research and Development Program of China (2020YFB1710001).

Abstract: A time series is a sequence of values obtained by measuring an object or system continuously at equal intervals. Mining the latent semantic information in a time series is essential for discovering how a system operates and for identifying sudden anomalies. However, most current time-series semantic mining algorithms impose constraints on the characteristics of the data, which makes it difficult for them to handle massive time-series data with diverse characteristics. To address this problem, a time-series semantic mining algorithm based on sub-series similarity is proposed. By computing the similarity between sub-series, the algorithm partitions the time series into a sequence of segments, applies two-level clustering, and identifies the underlying physical states in the time series. It also introduces a probability-based iterative scheme that dynamically adjusts the probability of each sub-series being selected as a reference sub-series according to the candidate segmentation, ensuring that the reference sub-series cover all physical states. Experimental results show that the recognition accuracy of the algorithm exceeds 90% on five real data sets, including PAMAP and Barbet. Compared with the FLUSS, pHMM, and AutoPlait algorithms, the proposed algorithm achieves higher recognition accuracy and running efficiency, as well as greater generality.
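The abstract does not give the algorithm's details, so the following is only a rough sketch of the two ideas it names: comparing sub-series by similarity, and iteratively adjusting the probability that a sub-series is chosen as a reference. Everything specific here is an assumption made for illustration and is not taken from the paper: the z-normalized Euclidean distance as the similarity measure, the function names `znorm`, `distance_profile`, and `pick_references`, and the `threshold` and `decay` parameters with their multiplicative down-weighting rule.

```python
import math
import random


def znorm(xs):
    """Z-normalize a window; a constant window maps to all zeros."""
    mean = sum(xs) / len(xs)
    std = math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))
    if std < 1e-12:
        return [0.0] * len(xs)
    return [(x - mean) / std for x in xs]


def distance_profile(series, ref_start, m):
    """Distance from the reference window to every length-m window.

    Assumption: z-normalized Euclidean distance stands in for the
    paper's similarity measure, which the abstract does not specify.
    """
    ref = znorm(series[ref_start:ref_start + m])
    profile = []
    for i in range(len(series) - m + 1):
        win = znorm(series[i:i + m])
        profile.append(math.sqrt(sum((a - b) ** 2 for a, b in zip(ref, win))))
    return profile


def pick_references(series, m, rounds, threshold, decay=0.1, seed=0):
    """Sample reference windows, down-weighting windows already matched.

    Assumption: windows within `threshold` of a chosen reference have
    their selection weight multiplied by `decay`, so later rounds tend
    to pick windows from parts of the series not yet represented.
    """
    rng = random.Random(seed)
    n = len(series) - m + 1
    weights = [1.0] * n
    refs = []
    for _ in range(rounds):
        start = rng.choices(range(n), weights=weights, k=1)[0]
        refs.append(start)
        for i, d in enumerate(distance_profile(series, start, m)):
            if d <= threshold:
                weights[i] *= decay
    return refs, weights


# Toy series with two regimes: a sine wave followed by a square wave.
series = [math.sin(2 * math.pi * i / 8) for i in range(40)]
series += [float(i % 8 < 4) for i in range(40)]
refs, weights = pick_references(series, m=8, rounds=4, threshold=0.5)
print(refs)
```

Down-weighting rather than excluding matched windows keeps every window selectable, which is one simple way to bias sampling toward uncovered states without ruling anything out; the paper's actual update rule may differ.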

Keywords: time series; semantic mining; similarity measure; clustering; k-nearest neighbor

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
