Authors: PENG Ren-jie (彭仁杰), YU Zheng-tao (余正涛)[1,2], GAO Sheng-xiang (高盛祥)[1,2], LI Yun-long (李云龙), GUO Jun-jun (郭军军)[1,2], ZHAO Pei-lian (赵培莲)
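Affiliations: [1] Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China; [2] Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming 650500, China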
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2021, No. 12, pp. 2561-2566 (6 pages)
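Funding: Supported by the National Key Research and Development Program of China (2018YFC0830105, 2018YFC0830101, 2018YFC0830100) and the National Natural Science Foundation of China (61972186, 61762056, 61472168).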
Abstract: Topic models have been widely used to discover topics in text. In the case (legal case) domain, however, the topics these methods generate are only weakly related to the case and are hard to interpret, so the quality of the generated topics is low. To address these problems, this paper proposes a topic optimization method guided by case elements. First, case element information is used to improve the topic model: case elements are combined with the feature vectors of the BTM topic model, and the relevance between document words and case elements is combined with the topic distribution of the BTM topic model, yielding topic words in case-related Weibo posts that are more relevant to the case. Topics are then represented by selecting case-related candidate words. Finally, the relevance between the case topic candidate words and the text words, together with the similarity between documents and case elements, is computed to obtain the case topic word set. Comparative experiments on a Sina Weibo dataset show that the method significantly improves the quality of case topic discovery.
Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]
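
The abstract describes combining the topic distribution of a topic model with the relevance between words and case elements, but it gives no formulas. As a rough illustration only, the sketch below re-ranks the words of one topic by a linear interpolation of the two signals. The function names, the `weight` parameter, the use of cosine similarity, and the toy vectors are all assumptions made for this example, not the authors' method.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rerank_topic_words(topic_word_prob, word_vecs, element_vecs, weight=0.5, top_k=3):
    """Re-rank one topic's words by mixing topic probability with case relevance.

    topic_word_prob: word -> probability under the topic (e.g. from a topic model).
    word_vecs:       word -> embedding vector (hypothetical input).
    element_vecs:    embedding vectors of the case elements (hypothetical input).
    weight:          assumed interpolation weight given to case relevance.
    """
    scores = {}
    for word, prob in topic_word_prob.items():
        vec = word_vecs.get(word)
        if vec is None:
            relevance = 0.0  # words without a vector get no case-relevance credit
        else:
            # Relevance to the case = best match against any case element.
            relevance = max((cosine(vec, e) for e in element_vecs), default=0.0)
        scores[word] = (1 - weight) * prob + weight * relevance
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Toy data: "weather" is probable under the topic but unrelated to the case.
topic_word_prob = {"weather": 0.5, "court": 0.3, "verdict": 0.2}
word_vecs = {"weather": [1.0, 0.0], "court": [0.0, 1.0], "verdict": [0.1, 0.9]}
element_vecs = [[0.0, 1.0]]

print(rerank_topic_words(topic_word_prob, word_vecs, element_vecs, top_k=2))
```

With `weight=0` the ranking falls back to the raw topic probabilities, so the parameter controls how strongly the case elements steer the topic words. Topic probabilities and cosine scores are on different scales, so a real system would likely normalize them before mixing.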