Case Topic Optimization Based on Case Elements


Authors: PENG Ren-jie; YU Zheng-tao [1,2]; GAO Sheng-xiang [1,2]; LI Yun-long; GUO Jun-jun [1,2]; ZHAO Pei-lian

Affiliations: [1] Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China; [2] Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming 650500, China

Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2021, No. 12, pp. 2561-2566 (6 pages)

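Funding: Supported by the National Key Research and Development Program of China (2018YFC0830105, 2018YFC0830101, 2018YFC0830100) and the National Natural Science Foundation of China (61972186, 61762056, 61472168).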

Abstract: Topic models are widely used to discover topics in text. In the legal-case domain, however, the topics these methods generate are only weakly related to the case and are hard to interpret, so the quality of the generated topics is low. To address these problems, this paper proposes a topic optimization method guided by case elements. First, case element information is used to improve the topic model: case elements are combined with the feature vectors of the BTM topic model, and the relevance between document words and case elements is combined with the BTM topic distribution, which yields the topic words in case-related Weibo posts that are more relevant to the case. Each topic is then represented by selecting candidate words related to the case. Finally, the relevance between the case topic candidate words and the text words, together with the similarity between documents and case elements, is computed to obtain the case topic word set. Comparative experiments on a Sina Weibo dataset show that the method significantly improves the quality of case topic discovery.
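The abstract does not state the scoring formula, so the following is only a minimal sketch of the general idea of combining a topic model's word distribution with word-to-case-element relevance. Everything in it is an assumption made for illustration: the linear mix controlled by `weight`, the use of cosine similarity, the function name `rerank_topic_words`, and the toy probabilities and two-dimensional vectors. It is not the authors' implementation, and the topic-word probabilities are hard-coded rather than produced by BTM.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rerank_topic_words(topic_word_prob, word_vecs, element_vecs, weight=0.5, top_n=3):
    """Re-rank one topic's words by mixing topic probability with case-element relevance.

    Hypothetical scoring rule (not taken from the paper):
        score(w) = (1 - weight) * p(w | topic) / max_p + weight * max_e cos(w, e)
    """
    max_prob = max(topic_word_prob.values())
    scores = {}
    for word, prob in topic_word_prob.items():
        # Relevance of a word is its best cosine match against any case element.
        relevance = max(cosine(word_vecs[word], e) for e in element_vecs)
        scores[word] = (1 - weight) * (prob / max_prob) + weight * relevance
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Toy inputs: stand-ins for one topic's word distribution and for embeddings.
topic_word_prob = {"weather": 0.30, "court": 0.25, "verdict": 0.20, "lunch": 0.25}
word_vecs = {
    "weather": [0.0, 1.0],
    "court": [1.0, 0.0],
    "verdict": [0.9, 0.1],
    "lunch": [0.1, 0.9],
}
element_vecs = [[1.0, 0.0]]  # a single hypothetical case-element vector

ranked = rerank_topic_words(topic_word_prob, word_vecs, element_vecs, weight=0.5, top_n=2)
print(ranked)
```

With `weight=0.0` the ranking falls back to the raw topic probabilities, so the most probable but case-unrelated word comes first. Raising `weight` promotes words that lie close to a case element, which is the effect the abstract describes qualitatively.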

Keywords: topic model; topic optimization; case elements; similarity calculation

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
