Authors: Hong YIN, Shu-qiang YANG, Xiao-qian ZHU, Shao-dong MA, Lu-min ZHANG
Affiliations: [1] College of Computer, National University of Defense Technology; [2] Xiangyang School for NCOs; [3] School of Engineering, University of Hull
Source: Frontiers of Information Technology & Electronic Engineering (信息与电子工程前沿(英文版)), 2015, No. 9, pp. 744-758 (15 pages)
Funding: Supported by the National High-Tech R&D Program (863) of China (Nos. 2012AA012600, 2011AA010702, 2012AA01A401, and 2012AA01A402); the National Natural Science Foundation of China (No. 60933005); the National Science and Technology of China (No. 2012BAH38B04)
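Abstract: The symbolic representation of time series has attracted much research interest recently. The high dimensionality typical of the data is challenging, especially as the time series becomes longer. The wide distribution of sensors collecting more and more data exacerbates the problem. Representing a time series effectively is an essential task for decision-making activities such as classification, prediction, and knowledge discovery. In this paper, we propose a new symbolic representation method for long time series based on trend features, called trend feature symbolic approximation (TFSA). The method uses a two-step mechanism to segment long time series rapidly. Unlike some previous symbolic methods, it focuses on retaining most of the trend features and patterns of the original series. A time series is represented by trend symbols, which are also suitable for use in knowledge discovery, such as association rule mining. TFSA provides the lower bounding guarantee. Experimental results show that, compared with some previous methods, it not only has better segmentation efficiency and classification accuracy, but is also applicable to knowledge discovery from time series.
Objective: To propose a general method for the knowledge discovery process on long time series.
Innovation: A time series symbolization method based on parallel segmentation is proposed, called trend feature symbolic approximation (TFSA). It segments long time series rapidly while retaining most of the trend features of the original series, and represents the resulting subsequences with feature symbols. The contribution of this paper is improved segmentation efficiency for long time series. In addition, because TFSA focuses on retaining most of the trend features of the original time series, the mined rules are easier to understand and interpret.
Methods: First, a two-step segmentation mechanism divides the time series into a sequence of subsequences of unequal length. Then, TFSA symbolizes the subsequences to obtain symbolic itemsets. Finally, an Apriori-based association rule algorithm performs knowledge discovery on the time series data.
Conclusions: For long time series, a parallel segmentation mechanism for sequences in massive-data environments is studied, based on the cumulative sum control chart method. It can be implemented on distributed nodes, and its efficiency improves further as the number of nodes increases. Unlike traditional methods, the TFSA symbolization method aims to retain most of the trend features and patterns of the original time series. It represents the time series with prescribed trend symbols, and its representation is designed with subsequent time series mining in mind. Experiments show that the proposed method improves on existing methods in both segmentation efficiency and classification accuracy.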
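The idea of representing a series by trend symbols can be illustrated with a toy sketch. This is not the TFSA algorithm, whose segmentation and symbol definitions are not given in this record. It only shows the general shape of the approach: cut the series where its direction changes, then label each piece. The function name `trend_symbols`, the symbols U/D/F (up, down, flat), and the `flat_tol` parameter are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def trend_symbols(series: Sequence[float],
                  flat_tol: float = 0.0) -> List[Tuple[int, int, str]]:
    """Split a series where its direction changes and label each segment.

    Returns (start_index, end_index, symbol) triples, where symbol is
    "U" (rising), "D" (falling) or "F" (flat within flat_tol).
    Illustrative only: the symbol set and cut rule are assumptions,
    not the definitions used by TFSA.
    """
    def direction(delta: float) -> str:
        if delta > flat_tol:
            return "U"
        if delta < -flat_tol:
            return "D"
        return "F"

    segments: List[Tuple[int, int, str]] = []
    if len(series) < 2:
        return segments  # no trend can be defined on fewer than two points

    start = 0
    current = direction(series[1] - series[0])
    for i in range(1, len(series) - 1):
        nxt = direction(series[i + 1] - series[i])
        if nxt != current:
            # Direction changed at point i: close the running segment here.
            segments.append((start, i, current))
            start = i
            current = nxt
    segments.append((start, len(series) - 1, current))
    return segments


if __name__ == "__main__":
    demo = [1, 2, 3, 3, 3, 2, 1, 2]
    print("".join(sym for _, _, sym in trend_symbols(demo)))
```

Segments here have unequal lengths, as in the record's description of the segmentation step, and the resulting symbol string could in principle be fed to an association rule miner. How TFSA chooses cut points and symbols in practice is left to the paper itself.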
Keywords: Long time series; Segmentation; Trend features; Symbolic; Knowledge discovery
Classification code: TP311.13 [Automation and Computer Technology: Computer Software and Theory]