基于权重微博链的改进LDA微博主题模型 (cited by: 9)

Improve LDA microblogs topics model based on weight microblogs chain

Authors: 李鹏[1], 于岩[1], 李英乐[1], 李星[1], 何赞园[1]

Source: Application Research of Computers (《计算机应用研究》), 2016, No. 7, pp. 2018-2021 (4 pages)

Funding: National Key Technology R&D Program of China (2014BAH30B01)

Abstract: Social networks, and microblogs in particular, contain a large number of short texts. Unlike traditional text, short texts carry semantic feature information at low density, which makes accurate topic mining difficult. To address this problem, this paper proposes a weighted microblog chain structure that assigns weights according to the publication time of each microblog and to social-behavior information (original posts, retweets, and comments), and that uses background knowledge to enrich its semantic features. It also proposes an improved LDA topic model based on this microblog chain structure (WMC-LDA). Experimental results show that, compared with the standard LDA model, the proposed model achieves a lower perplexity, that is, lower predictive uncertainty.
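The perplexity measure named in the abstract can be illustrated with a minimal sketch. This uses the commonly used definition for topic models, the exponential of the negative average per-token log-likelihood, and is not the authors' code. The names `docs`, `theta`, and `phi` are assumptions introduced here for illustration: `theta[d][k]` is the weight of topic k in document d, and `phi[k][w]` is the probability of word w under topic k.

```python
import math


def perplexity(docs, theta, phi):
    """Perplexity of a fitted topic model on a set of documents.

    docs  : list of documents, each a list of integer word ids
    theta : theta[d][k], topic mixture of document d (assumed name)
    phi   : phi[k][w], word distribution of topic k (assumed name)
    """
    log_likelihood = 0.0
    n_tokens = 0
    for d, words in enumerate(docs):
        for w in words:
            # p(w | d) = sum over topics of theta[d][k] * phi[k][w]
            p_w = sum(theta[d][k] * phi[k][w] for k in range(len(phi)))
            log_likelihood += math.log(p_w)
            n_tokens += 1
    # Lower values mean the model is less "surprised" by the text.
    return math.exp(-log_likelihood / n_tokens)


# Toy check: a model that spreads probability uniformly over a
# 4-word vocabulary is exactly as uncertain as a 4-way guess.
toy_docs = [[0, 1, 2, 3]]
toy_theta = [[0.5, 0.5]]
toy_phi = [[0.25, 0.25, 0.25, 0.25], [0.25, 0.25, 0.25, 0.25]]
uniform_ppl = perplexity(toy_docs, toy_theta, toy_phi)
```

In the toy check, `uniform_ppl` equals the vocabulary size, 4, up to floating-point rounding. A model that concentrates probability on the words that actually occur yields a smaller value, which is the sense in which a lower perplexity indicates lower predictive uncertainty.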

Keywords: short text; topic mining; microblog chain; latent Dirichlet allocation (LDA); perplexity

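Classification: TP399 [Automation and Computer Technology: Computer Application Technology]; TP391 [Automation and Computer Technology: Computer Science and Technology]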
