A Microblog Topic Generation Method Based on Tag Integration


Author: BAI Yang [1] (School of Information Engineering, Eastern Liaoning University, Dandong 118003, China)

Affiliation: [1] School of Information Engineering, Eastern Liaoning University, Dandong 118003, Liaoning, China

Source: Journal of Eastern Liaoning University: Natural Science Edition, 2020, No. 2, pp. 127-130 (4 pages)

Funding: Eastern Liaoning University Doctoral Research Start-up Fund (2019BS025).

Abstract: Mining and analyzing user-generated content is an effective way to obtain microblog topics. To address the cold-start problem caused by users rarely assigning tags to themselves, this paper proposes a microblog topic generation method that fuses user tags with topic tags. First, user tags are taken as features, the sparse user vectors are compressed, and user tag similarity is computed. Second, the LDA topic model is used to extract topics from each user's microblogs, producing microblog topic tags. Third, the two kinds of tags are fused to build a user tag-topic similarity model, from which the microblog topics are obtained. Finally, the method is applied to a microblog data set, yielding microblog topics expressed as hot tags. The result is largely consistent with the topics generated by the LDA topic model alone, while the hot tags are more representative of the topics.

Keywords: user-generated content; topic model; user tags; user similarity

Classification No.: TP311 [Automation and Computer Technology: Computer Software and Theory]
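
The pipeline outlined in the abstract (sparse user-tag vectors, tag similarity, fusion with LDA topic tags, hot tags) can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything here is an assumption made for illustration rather than the author's method: the dictionary storage of sparse vectors, cosine similarity as the similarity measure, the linear weight `alpha`, the function names `cosine`, `fuse`, and `hot_tags`, and the toy data. The topic-tag weights stand in for the output of an LDA model, which is not run here.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as {tag: weight}."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def fuse(user_tags, topic_tags, alpha=0.5):
    """Merge self-assigned user tags with topic tags into one sparse vector.

    alpha is a hypothetical mixing weight: each user tag contributes alpha,
    each topic tag contributes (1 - alpha) times its topic weight.
    """
    fused = Counter()
    for tag in user_tags:
        fused[tag] += alpha
    for tag, weight in topic_tags.items():
        fused[tag] += (1.0 - alpha) * weight
    return dict(fused)


def hot_tags(fused_profiles, k=3):
    """Return the k tags with the largest total fused weight over all users."""
    totals = Counter()
    for profile in fused_profiles.values():
        totals.update(profile)  # adds the weights tag by tag
    return [tag for tag, _ in totals.most_common(k)]


# Toy data (invented): user -> (self-assigned tags, topic tags with weights).
# The topic-tag weights are placeholders for what an LDA model would produce.
users = {
    "u1": ({"travel", "food"}, {"food": 0.6, "photography": 0.4}),
    "u2": (set(), {"food": 0.7, "travel": 0.3}),  # cold start: no user tags
    "u3": ({"music"}, {"music": 0.8, "travel": 0.2}),
}

profiles = {name: fuse(tags, topics) for name, (tags, topics) in users.items()}
sim_u1_u2 = cosine(profiles["u1"], profiles["u2"])
sim_u2_u3 = cosine(profiles["u2"], profiles["u3"])
top_tags = hot_tags(profiles, k=2)
```

In this toy setup the cold-start user `u2` has no self-assigned tags, yet still receives a usable profile from the topic tags, which is the motivation for fusion that the abstract states. Storing only non-zero entries in a dictionary is one simple way to keep sparse tag vectors compact; the compression step mentioned in the abstract may work differently.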

 
