Authors: WANG Xing (王兴) [1,2], WU Yi (吴艺) [2], LIN Jie (林劼) [2], ZHUO Yi-Fan (卓一帆) [2]
Affiliations: [1] School of Information Science and Engineering, Central South University, Changsha 410075, China; [2] College of Mathematics and Informatics, Fujian Normal University, Fuzhou 350108, China
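Source: Computer Systems & Applications (《计算机系统应用》), 2018, No. 4, pp. 10-17 (8 pages)
Funding: National Natural Science Foundation of China (61472082); Natural Science Foundation of Fujian Province (2014J01220)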
Abstract: Modeling historical data statistically and mining its regularities quickly and effectively is of great significance. Given the limitations that models face in practical data-mining applications and the good statistical properties of the Markov model, this study designs and implements a variable-order Markov model based on the suffix array and the suffix automaton. Building on a suffix-tree-style implementation, the algorithm introduces suffix links, which allow fast jumps between the subsequences of each state and let the model compute probabilities for different order lengths dynamically and adaptively. Experimental results show that, compared with the traditional Markov model, the model builds the probabilistic statistics of the historical data and the links between the suffix subsequences of each state in linear time and space, greatly reducing storage and running time and enabling online learning and application on large-scale data.
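Keywords: Markov model; variable-order Markov model; trie; suffix array; suffix automaton
Classification number: TP311.13 [Automation and Computer Technology: Computer Software and Theory]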
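
The abstract's idea of following suffix links to fall back to shorter contexts can be illustrated with a small sketch. This is not the authors' implementation, which the record does not show. It is a textbook suffix automaton with occurrence counts, and the names `SuffixAutomaton`, `build`, and `predict` are hypothetical. It also assumes counts are propagated in one batch pass after construction, whereas the paper describes online learning.

```python
class SuffixAutomaton:
    """Textbook suffix automaton; count[s] becomes the number of occurrences
    of the substrings represented by state s (illustrative sketch only)."""

    def __init__(self):
        self.next = [{}]     # outgoing transitions per state
        self.link = [-1]     # suffix link per state
        self.length = [0]    # length of the longest substring in each state
        self.count = [0]     # occurrence count, final after build()
        self.last = 0

    def _new_state(self, length, count):
        self.next.append({})
        self.link.append(-1)
        self.length.append(length)
        self.count.append(count)
        return len(self.length) - 1

    def extend(self, symbol):
        cur = self._new_state(self.length[self.last] + 1, 1)
        p = self.last
        while p != -1 and symbol not in self.next[p]:
            self.next[p][symbol] = cur
            p = self.link[p]
        if p == -1:
            self.link[cur] = 0
        else:
            q = self.next[p][symbol]
            if self.length[p] + 1 == self.length[q]:
                self.link[cur] = q
            else:
                # Split q: the clone takes over the shorter substrings.
                clone = self._new_state(self.length[p] + 1, 0)
                self.next[clone] = dict(self.next[q])
                self.link[clone] = self.link[q]
                while p != -1 and self.next[p].get(symbol) == q:
                    self.next[p][symbol] = clone
                    p = self.link[p]
                self.link[q] = clone
                self.link[cur] = clone
        self.last = cur


def build(sequence):
    """Build the automaton, then push counts up the suffix links once."""
    sam = SuffixAutomaton()
    for symbol in sequence:
        sam.extend(symbol)
    states = sorted(range(1, len(sam.length)),
                    key=lambda s: sam.length[s], reverse=True)
    for s in states:
        sam.count[sam.link[s]] += sam.count[s]
    return sam


def predict(sam, context):
    """Return (order used, next-symbol distribution) for the longest suffix
    of `context` that was seen in training and has a continuation."""
    state, matched = 0, 0
    for symbol in context:
        # Shorten the matched suffix via suffix links until it can be extended.
        while state != 0 and symbol not in sam.next[state]:
            state = sam.link[state]
            matched = sam.length[state]
        if symbol in sam.next[state]:
            state = sam.next[state][symbol]
            matched += 1
    # Back off further if the matched suffix only occurs at the very end.
    while state != 0 and not sam.next[state]:
        state = sam.link[state]
        matched = sam.length[state]
    total = sum(sam.count[t] for t in sam.next[state].values())
    if total == 0:
        return matched, {}
    return matched, {sym: sam.count[t] / total
                     for sym, t in sam.next[state].items()}


sam = build("abracadabra")
print(predict(sam, "a"))     # order-1 context
print(predict(sam, "zzbr"))  # unseen prefix is dropped, "br" is used
```

The order is not fixed in advance: `predict` uses as much of the context as the training sequence supports and shortens it through suffix links only when needed, which is the variable-order behaviour the abstract describes.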