STUDY AND APPLICATION OF PREDICTION ALGORITHM OF LARGE-SCALE MESSAGES STREAMS BASED ON FAST AND HIGH UTILITY ITEMSET MINING

Cited by: 2


Authors: Mu Xiaofang (穆晓芳) [1]; Deng Hongxia (邓红霞); Guo Husheng (郭虎升) [3]; Zhao Peng (赵鹏) [1]

Affiliations: [1] Department of Computer Science, Taiyuan Normal University, Taiyuan 030619, Shanxi, China; [2] College of Information and Computer, Taiyuan University of Technology, Taiyuan 030024, Shanxi, China; [3] School of Computer and Information Technology, Shanxi University, Taiyuan 030006, Shanxi, China

Source: Computer Applications and Software (《计算机应用与软件》), 2019, No. 11, pp. 243-249 (7 pages)

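Funding: National Natural Science Foundation of China (61503229); Shanxi Province Key Research and Development Program (201803D31055); Shanxi Province Applied Basic Research Program (201801D121135)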

Abstract: To improve the accuracy and efficiency of topic prediction for large-scale message streams, this paper proposes a topic prediction algorithm for message streams based on high utility itemset mining. The internal utility and external utility of each word in the time window are computed, and the minimum utility threshold is derived from the utilities of all words in the session (transaction). A high utility itemset mining algorithm then generates the set of candidate topic patterns, from which the final topic patterns are extracted. To improve the time and storage efficiency of high utility itemset mining, a triangle itemset utility tree is designed to store the utility information of itemsets, and a topic search tree is designed to store the candidate topic patterns. Experiments on real message stream datasets show that the proposed algorithm effectively improves topic prediction accuracy while achieving a short response time.
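The utility computation summarized in the abstract can be illustrated with a minimal sketch. It makes several assumptions that the record itself does not confirm: the internal utility of a word is taken to be its count within one time window, the external utility is a per-word weight, and the minimum utility threshold is a fixed fraction (`min_ratio`, a hypothetical parameter) of the total utility over all windows. Candidates are enumerated by brute force, so this is not the paper's triangle itemset utility tree or topic search tree, only the standard high utility itemset definition applied to toy data.

```python
from itertools import combinations

def itemset_utility(itemset, window, external):
    """Utility of an itemset in one window; 0 unless every word is present."""
    if not all(word in window for word in itemset):
        return 0
    # internal utility (count in window) times external utility (word weight)
    return sum(window[word] * external[word] for word in itemset)

def mine_high_utility_itemsets(windows, external, min_ratio, max_size=3):
    """Brute-force search for itemsets whose total utility meets the threshold."""
    total = sum(q * external[w] for win in windows for w, q in win.items())
    min_util = min_ratio * total  # assumed rule: a fraction of total utility
    vocab = sorted({w for win in windows for w in win})
    found = {}
    for size in range(1, max_size + 1):
        for itemset in combinations(vocab, size):
            utility = sum(itemset_utility(itemset, win, external) for win in windows)
            if utility >= min_util:
                found[itemset] = utility
    return min_util, found

# Toy data (hypothetical): word counts per time window and per-word weights.
windows = [
    {"storm": 2, "flood": 1},
    {"storm": 1, "flood": 2, "rescue": 1},
    {"rescue": 3},
]
external = {"storm": 3, "flood": 2, "rescue": 1}

min_util, patterns = mine_high_utility_itemsets(windows, external, min_ratio=0.5)
print(min_util, patterns)
```

On this toy data the pair ("flood", "storm") reaches the threshold although neither word does on its own. Utility is therefore not anti-monotone in the way support is, which is why high utility itemset mining generally needs dedicated pruning and storage structures instead of plain frequent itemset pruning.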

Keywords: high utility itemset mining; frequent itemset mining; data stream; topic prediction; big data; network security

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]

 
