Authors: WU Di (吴迪)[1], MA Wen-li (马文莉), YANG Li-jun (杨利君)
Affiliation: [1] School of Information and Electrical Engineering, Hebei University of Engineering, Handan 056038, Hebei, China
Source: Computer Engineering and Design (《计算机工程与设计》), 2023, No. 12, pp. 3800-3808 (9 pages)
Funding: Natural Science Foundation of Hebei Province (F2020402003).
Abstract: Microblog short texts are time-sensitive, and topic modeling of them tends to lose intermediate-frequency words. To address both problems, a microblog user portrait method with double-layer weighting by a forgetting curve and BTM word frequency is proposed. User interest topic words are obtained under both weights by computing a time weight for each term and raising the word frequency weight of intermediate-frequency words. A forgetting curve is used to fit a time function, from which the time weight of each microblog term is calculated. The recalculated word frequency features are used as the random values in Gibbs sampling, giving an improved word-frequency-weighted BTM topic model that raises the word frequency weight of intermediate-frequency words. A method for calculating the influence of microblog user behavior is also proposed and used to construct user portraits under hot topics. Experimental results show that, compared with the BTM, SL-LDA and LDA methods, the proposed method achieves the best PMI-score in each of the time slices considered. It accurately mines the topic words of each time slice and builds word clouds of user interest topic words under hot topics, giving an accurate picture of user interests under those topics.
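The abstract does not give the fitted time function, so the following is only a minimal sketch of the general idea of forgetting-curve time weighting, not the authors' formula. It assumes an Ebbinghaus-style exponential decay, w = exp(-age / stability); the names `time_weight`, `weighted_term_scores` and the parameter `stability_days` are hypothetical and chosen for illustration.

```python
import math
from collections import defaultdict


def time_weight(age_days, stability_days=7.0):
    """Exponential forgetting-curve weight (assumed form, not from the paper).

    A term posted just now gets weight 1.0; the weight decays towards 0
    as the post gets older. `stability_days` is a hypothetical parameter
    controlling how fast interest is "forgotten".
    """
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if stability_days <= 0:
        raise ValueError("stability_days must be positive")
    return math.exp(-age_days / stability_days)


def weighted_term_scores(posts, now_day, stability_days=7.0):
    """Sum time weights per term over a user's posts.

    `posts` is an iterable of (day, terms) pairs, where `day` is the day
    number on which the post was published and `terms` is a list of
    already-segmented words.
    """
    scores = defaultdict(float)
    for day, terms in posts:
        w = time_weight(now_day - day, stability_days)
        for term in terms:
            scores[term] += w
    return dict(scores)


if __name__ == "__main__":
    posts = [
        (0, ["football", "league"]),   # old post
        (9, ["football", "transfer"]),  # recent post
        (10, ["election"]),             # posted today
    ]
    for term, score in sorted(weighted_term_scores(posts, now_day=10).items()):
        print(f"{term}: {score:.3f}")
```

Under this assumption, a term that appears only in old posts contributes little to the user's current interest profile, while recent terms dominate. In the method described above, such time weights would be combined with the word frequency weights from the modified BTM model, which this sketch does not attempt to reproduce.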
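Keywords: microblog; user portrait; double-layer weighting; forgetting curve; time function; biterm topic model (BTM); behavior influence
Classification: TP391 [Automation and Computer Technology: Computer Application Technology]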