Research on short text mining based on social network analysis and LDA topic mining  (Cited by: 5)

Authors: WU Shuai (武帅); SHI Yi (施奕); YANG Xiuzhang (杨秀璋); XIANG Meiyu (项美玉)

Affiliations: [1] School of Information, Guizhou University of Finance and Economics, Guiyang 550025, Guizhou, China; [2] Lianshui County High-level Talent Development Center, Huaian 223200, Jiangsu, China; [3] Guiyang Institute for Big Data and Finance, Guizhou University of Finance and Economics, Guiyang 550025, Guizhou, China

Source: Modern Electronics Technique (《现代电子技术》), 2022, No. 20, pp. 124-128 (5 pages)

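Funding: Guizhou Provincial Science and Technology Program (Qiankehe Jichu [2019]1041; Qiankehe Jichu [2019]1403; Qiankehe Jichu [2020]1Y279; Qiankehe Jichu [2020]1Y420); Youth Science and Technology Talent Growth Project of the Guizhou Provincial Department of Education (Qianjiaohe KY Zi [2016]175; Qianjiaohe KY Zi [2021]135); Guizhou University of Finance and Economics 2019 University-level Project (2019XQN01).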

Abstract: With the continued growth of self-media, efficiently mining information from short text data has become a research focus. When applied to short texts, traditional topic mining methods judge topics only by the frequency of individual words and ignore semantic association structure, so their results are poor. To address the scarcity of short text data, this paper proposes a topic mining and analysis model based on social network analysis and LDA. First, co-word analysis is used to analyze the relationships between subject terms across documents. Second, social network analysis is applied to increase the coupling among subject terms in the co-word network. Third, a latent space model reduces the dimensionality of the co-word network, improving the coupling of the social network. Finally, latent position clustering uncovers potential communities and improves topic identification. Experimental results show that the proposed method improves, to some extent, the performance of topic mining algorithms on short texts. It facilitates short text research, has practical value, and provides a reference for later application to frontier topic identification.
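The first two steps named in the abstract (co-word analysis followed by a network-level view of the subject terms) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `build_coword_network` and `weighted_degree`, the toy documents, and the choice of weighted degree as the network measure are all assumptions made for illustration, and the latent space and latent position clustering steps are not covered.

```python
from collections import Counter
from itertools import combinations


def build_coword_network(docs):
    """Count how often each unordered pair of terms shares a document."""
    edges = Counter()
    for tokens in docs:
        # Use the set of terms so repeats inside one short text count once.
        for a, b in combinations(sorted(set(tokens)), 2):
            edges[(a, b)] += 1
    return edges


def weighted_degree(edges):
    """Sum the co-occurrence weights on every edge touching each term."""
    degree = Counter()
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return degree


# Hypothetical, already-tokenized short texts (not data from the paper).
docs = [
    ["topic", "model", "lda"],
    ["lda", "gibbs", "sampling"],
    ["topic", "lda", "network"],
    ["network", "community", "clustering"],
]

edges = build_coword_network(docs)
degree = weighted_degree(edges)

# Terms ranked by how strongly they are tied to the rest of the network.
for term, score in degree.most_common(3):
    print(term, score)
```

Edge keys are stored as alphabetically sorted pairs so that each co-occurrence is counted once regardless of word order. In a fuller pipeline, the resulting weighted edge list would be the input to the dimensionality reduction and clustering steps the abstract describes.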

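Keywords: LDA topic mining; co-word analysis; social network analysis; short text mining; latent space model; latent position clustering; topic identification; Gibbs sampling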

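CLC number: TN911-34 [Electronics and Telecommunications: Communication and Information Systems]; TP391 [Electronics and Telecommunications: Information and Communication Engineering]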