Author: BAI Yang (白杨) [1], School of Information Engineering, Eastern Liaoning University, Dandong 118003, China
Source: Journal of Eastern Liaoning University: Natural Science Edition (《辽东学院学报(自然科学版)》), 2020, No. 2, pp. 127-130 (4 pages)
Funding: Doctoral Research Start-up Fund of Eastern Liaoning University (2019BS025).
Abstract: Mining and analyzing user-generated content is an effective way to identify microblog topics. To address the cold-start problem caused by sparse use of user tags, this paper proposes a microblog topic generation method that fuses user tags with topic tags. First, user tags are taken as feature items: the sparse user vectors are compressed and the user-tag similarity is computed. Second, the LDA topic model is used to extract topics from each user's microblogs, which generates the microblog topic tags. Third, the two kinds of tags are fused to build a user tag-topic similarity model, from which the microblog topics are obtained. Finally, the method is applied to a microblog dataset, yielding microblog topics expressed as hot tags. The result is largely consistent with the topics generated by the LDA topic model, while the hot tags obtained are more representative of the topics.
Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]
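The tag-fusion idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the paper's method: the abstract does not state which similarity measure or fusion rule is used, so cosine similarity over binary tag vectors and a simple set union are swapped in, and the topic tags are supplied by hand rather than produced by an actual LDA model. The function names and sample data are hypothetical.

```python
from collections import Counter
from math import sqrt


def cosine_similarity(tags_a, tags_b):
    """Cosine similarity of two tag sets viewed as binary vectors (assumed measure)."""
    a, b = set(tags_a), set(tags_b)
    if not a or not b:
        return 0.0  # a user without tags is the cold-start case
    return len(a & b) / sqrt(len(a) * len(b))


def fuse_tags(user_tags, topic_tags):
    """Union each user's own tags with topic tags (assumed to come from LDA)."""
    users = set(user_tags) | set(topic_tags)
    return {u: set(user_tags.get(u, ())) | set(topic_tags.get(u, ())) for u in users}


def hot_tags(fused, top_n=3):
    """Rank tags by how many users carry them; ties are broken alphabetically."""
    counts = Counter(tag for tags in fused.values() for tag in tags)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:top_n]]


# Hypothetical data: user u2 has assigned no tags (cold start).
user_tags = {"u1": ["music", "travel"], "u2": [], "u3": ["music"]}
# Hypothetical topic tags standing in for LDA output.
topic_tags = {"u1": ["film"], "u2": ["music", "film"], "u3": ["travel"]}

fused = fuse_tags(user_tags, topic_tags)
before = cosine_similarity(user_tags["u1"], user_tags["u2"])
after = cosine_similarity(fused["u1"], fused["u2"])
top = hot_tags(fused, top_n=2)
```

In this toy data, the similarity between u1 and u2 is zero before fusion because u2 has no tags, and becomes nonzero once the topic tags are merged in, which is the cold-start effect the abstract refers to.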