Forgetting curve and BTM word frequency double-weighted microblog user portrait  Cited by: 2


Authors: WU Di [1], MA Wen-li, YANG Li-jun

Affiliation: [1] School of Information and Electrical Engineering, Hebei University of Engineering, Handan 056038, Hebei, China

Source: Computer Engineering and Design (《计算机工程与设计》), 2023, No. 12, pp. 3800-3808 (9 pages)

Funding: Natural Science Foundation of Hebei Province (F2020402003).

Abstract: Microblog short texts are time-sensitive, and medium-frequency words tend to be lost during topic modeling. To address both problems, a microblog user portrait method with double-layer weighting by forgetting curve and BTM word frequency is proposed. Double-weighted user interest topic words are obtained by computing a time weight for each term and raising the word-frequency weight of medium-frequency words. First, a forgetting curve is used to fit a time function, from which the time weight of each microblog term is computed. Second, the recomputed word-frequency feature is used as the random value in Gibbs sampling, giving an improved word-frequency-weighted BTM topic model that raises the word-frequency weight of medium-frequency words. Third, a method for computing microblog user behavior influence is proposed and used to construct user portraits under hot topics. Experimental results show that, compared with the BTM, SL-LDA, and LDA methods, the proposed method achieves the best PMI-score across the different time slices. It accurately mines the topic words of each time slice and builds word clouds of user interest topic words, accurately displaying user interests under hot topics.
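The time-weighting step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the fitted time function, so the exponential form exp(-t/S) below (the classic Ebbinghaus-style retention curve) is an assumption, as are the names `time_weight`, `weighted_term_scores`, and the `strength_days` parameter. The sketch covers only the time layer; the word-frequency-weighted BTM and the behavior influence computation are not reproduced.

```python
import math
from collections import defaultdict


def time_weight(age_days, strength_days=7.0):
    """Assumed forgetting-curve weight exp(-t/S); not the paper's fitted function."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    return math.exp(-age_days / strength_days)


def weighted_term_scores(posts, now_day, strength_days=7.0):
    """Sum time weights per term.

    posts: iterable of (day_posted, list_of_terms) pairs (hypothetical input format).
    Older posts contribute less to a term's score than recent ones.
    """
    scores = defaultdict(float)
    for day_posted, terms in posts:
        w = time_weight(now_day - day_posted, strength_days)
        for term in terms:
            scores[term] += w
    return dict(scores)


# Toy data: one post from a week ago and one from today.
posts = [(0, ["football", "match"]), (7, ["football", "concert"])]
scores = weighted_term_scores(posts, now_day=7)
```

With these toy posts, a term that appears only in the week-old post ends up with a lower score than a term that appears only in today's post, which is the behavior a time weight is meant to capture.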

Keywords: microblog; user portrait; double-layer weighting; forgetting curve; time function; biterm topic model (BTM); behavior influence

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
