Microblog Popular Information Detection Based on Hidden Semi-Markov Model  (Cited by: 1)

Authors: XIE Bai-lin, LI Qi [1,2], KUANG Jiang

Affiliations: [1] School of Information Science and Technology, Guangdong University of Foreign Studies, Guangzhou 510006, China; [2] School of Cyber Security, Guangdong University of Foreign Studies, Guangzhou 510006, China

Source: Computer Science (《计算机科学》), 2022, Issue S01, pp. 291-296 (6 pages)

Funding: Guangdong Basic and Applied Basic Research Foundation (2018A0303130045); Science and Technology Program of Guangzhou (201904010334).

Abstract: Microblogs have become an important platform for publishing and obtaining information, and also a major channel through which rumors spread. Detecting popular information at an early stage makes it possible to identify hot events promptly and to find and suppress rumors before they spread widely. This paper proposes a method for detecting popular microblog information based on the hidden semi-Markov model (HSMM), approached from the perspective of how popular information propagates. Observation values are constructed from the influence level of each forwarder and the time interval between two adjacent forwarders. The influence level of a forwarder is obtained automatically with a random forest classifier, and an HSMM characterizes the propagation process of popular information. The method has a model training phase and a detection phase. In the detection phase, the average log-likelihood of the observation sequence generated by each piece of information during propagation is computed against the model, and the popularity of each piece of information is updated in real time, so that potentially popular information can be detected early. The method was tested on datasets collected from Sina Weibo and Twitter, and the authors report that the experimental results demonstrate its effectiveness.
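The scoring step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a discrete, explicit-duration HSMM, and it assumes that each observation is a single integer symbol formed by combining the forwarder's influence level with a binned time gap. The abstract does not specify the encoding, the bin edges, or the model parameters. The names `encode_observations`, `hsmm_avg_loglik`, and `gap_edges` are hypothetical.

```python
import math
from bisect import bisect_right


def encode_observations(levels, gaps, gap_edges):
    """Map (influence level, time gap) pairs to integer symbols.

    Assumed encoding (not from the paper): symbol = level * n_bins + gap_bin,
    where gap_bin is the index of the interval that contains the gap.
    """
    n_bins = len(gap_edges) + 1
    return [lvl * n_bins + bisect_right(gap_edges, gap)
            for lvl, gap in zip(levels, gaps)]


def hsmm_avg_loglik(obs, pi, A, dur, B):
    """Average log-likelihood of obs under an explicit-duration HSMM.

    pi[j]       initial probability of state j
    A[i][j]     transition probability between segments (A[i][i] == 0)
    dur[j][d-1] probability that a segment of state j lasts d steps
    B[j][o]     probability that state j emits symbol o
    Simplifying assumption: the last segment ends exactly at the final
    observation (no right-censoring of the current segment).
    """
    T, N, D = len(obs), len(pi), len(dur[0])
    # alpha[t][j] = P(o_1..o_t, a segment of state j ends at time t)
    alpha = [[0.0] * N for _ in range(T + 1)]
    for t in range(1, T + 1):
        for j in range(N):
            total, emit = 0.0, 1.0
            for d in range(1, min(D, t) + 1):
                # Extend the candidate segment one step further back in time.
                emit *= B[j][obs[t - d]]
                if t - d == 0:
                    enter = pi[j]  # segment starts the sequence
                else:
                    enter = sum(alpha[t - d][i] * A[i][j] for i in range(N))
                total += enter * dur[j][d - 1] * emit
            alpha[t][j] = total
    # math.log raises ValueError if the model assigns the sequence zero probability.
    return math.log(sum(alpha[T])) / T
```

A detector along these lines would call `hsmm_avg_loglik` again each time a message gains a new forwarder and treat the result as the message's current popularity score. The recursion works in plain probability space, so it suits short sequences only; long sequences would need scaling or log-space arithmetic to avoid underflow.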

Keywords: microblog; popular information; hidden semi-Markov model; popularity; propagation process

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
