Affiliation: [1] School of Computer Science and Technology, Anhui University, Hefei, Anhui 230601
Source: Computer Applications and Software (《计算机应用与软件》), 2017, No. 9, pp. 253-256, 305 (5 pages)
Funding: National Natural Science Foundation of China (61202227); National Key Technology R&D Program of China (2015BAK24B01)
Abstract: Feature representation and similarity measurement of time series are an important foundation of time series data mining. Existing sequence representation methods have difficulty capturing the morphological trend of a sequence, which makes similarity measurement results imprecise. To address this problem, a new similarity measurement algorithm based on morphological patterns is proposed. Building on piecewise linear representation, the algorithm classifies each segment into a morphological pattern according to how the slope of the sequence changes across time periods, and represents each pattern with a special character. The time series is thereby converted into a character string, and the distance between two strings, computed with the longest common subsequence method, is taken as the distance between the time series. Experiments verify the effectiveness of the method. Theoretical analysis and experiments show that the method is insensitive to the values of the data points, reduces the interference of noise, and achieves high accuracy.
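The pipeline described in the abstract (segment slopes, pattern symbols, longest common subsequence) can be illustrated with a minimal Python sketch. The abstract does not give the segmentation rule, the symbol alphabet, the slope thresholds, or the exact distance formula, so everything specific here is an assumption made for illustration: equal-length segments with end-point slopes, a three-letter alphabet (`U` up, `D` down, `F` flat), a threshold `eps`, and a distance of `1 - LCS / max(len)`. The function names are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def segment_slopes(series: Sequence[float], seg_len: int) -> list[float]:
    """End-point slope of each segment (assumed: equal-length segmentation)."""
    slopes = []
    last = len(series) - 1
    for start in range(0, last, seg_len):
        end = min(start + seg_len, last)
        slopes.append((series[end] - series[start]) / (end - start))
    return slopes


def to_symbols(slopes: Sequence[float], eps: float) -> str:
    """Map each slope to a pattern character (assumed alphabet: U, D, F)."""
    chars = []
    for s in slopes:
        if s > eps:
            chars.append("U")   # rising segment
        elif s < -eps:
            chars.append("D")   # falling segment
        else:
            chars.append("F")   # roughly flat segment
    return "".join(chars)


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def morph_distance(x: Sequence[float], y: Sequence[float],
                   seg_len: int = 2, eps: float = 0.1) -> float:
    """Assumed distance: 1 - LCS / max(len), in [0, 1]."""
    sx = to_symbols(segment_slopes(x, seg_len), eps)
    sy = to_symbols(segment_slopes(y, seg_len), eps)
    longest = max(len(sx), len(sy))
    if longest == 0:
        return 0.0
    return 1.0 - lcs_length(sx, sy) / longest


x = [0, 1, 2, 2, 2, 1, 0]
y = [10, 11, 12, 12, 12, 11, 10]   # same shape as x, shifted by +10
z = [0, -1, -2, -2, -2, -1, 0]     # mirrored shape
print(morph_distance(x, y), morph_distance(x, z))
```

Because only slopes enter the symbol string, `x` and `y` receive the same string and a distance of zero despite their different value levels, which is the kind of value insensitivity the abstract claims. A real implementation would presumably use an adaptive piecewise linear segmentation and a richer pattern alphabet than this sketch assumes.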
Classification: TP3 [Automation and Computer Technology: Computer Science and Technology]