A method of micro-blog topic discovery based on feature words selection and text similarity (特征词选择与相似度融合的微博话题发现方法)

Authors: CHEN Hongyang[1], WANG Linlin[1], CHEN Yingsheng[1], LU Jiangkun[1], ZUO Xue[1]

Affiliation: [1] College of Computer Engineering, Chongqing College of Humanities Science and Technology, Chongqing 401524, China

Source: Telecommunications Science (《电信科学》), 2017, No. 10, pp. 134-140 (7 pages)

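Funding: Science and Technology Project of Chongqing Municipal Education Commission (No.KJ1601601); Chongqing Special Project for Key Generic Technology Innovation in Key Industries (No.cstc2015zdcy-ztzx40007); National Natural Science Foundation of China (No.61173184)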

Abstract: Micro-blog short texts often contain words that are identical or similar in form or meaning yet only loosely related to the topic. Such words interfere considerably with the accurate measurement of similarity between texts and thereby degrade the quality of topic discovery. A feature word selection algorithm that combines text content with structured information is proposed, which can effectively extract representative feature words. The strategy for computing similarity between texts and between a text and a topic is also improved. The feature word selection algorithm and the similarity computation are then fused and applied to micro-blog text data to discover topics. Experimental results show that the method effectively reduces the average missing rate and false detection rate of topic discovery and improves its quality.

Keywords: micro-blog; feature word selection; similarity; topic discovery

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]
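
The pipeline summarized in the abstract (select feature words, measure similarity, group posts into topics) can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details are not given in this record. It assumes pre-tokenized posts, uses smoothed TF-IDF weights, treats hashtag words as a hypothetical stand-in for "structured information" (via the assumed `boost` parameter), and swaps in plain cosine similarity with single-pass clustering. The function names, parameters, and threshold are all illustrative assumptions.

```python
import math
from collections import Counter


def select_features(tokens, hashtag_tokens, idf, k=5, boost=2.0):
    """Keep the k highest-weighted words of one post.

    Weight = term frequency * IDF, multiplied by `boost` when the word also
    occurs in a hashtag (an assumed proxy for structured information).
    """
    weights = {}
    for word, count in Counter(tokens).items():
        w = count * idf.get(word, 0.0)
        if word in hashtag_tokens:
            w *= boost
        weights[word] = w
    # Sort by descending weight, breaking ties alphabetically for determinism.
    top = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
    return {word: w for word, w in top if w > 0}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(word, 0.0) for word, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def discover_topics(posts, threshold=0.3, k=5, boost=2.0):
    """Single-pass clustering over posts given as (tokens, hashtag_tokens)."""
    n = len(posts)
    df = Counter()
    for tokens, _ in posts:
        df.update(set(tokens))
    # Smoothed inverse document frequency, always positive.
    idf = {w: math.log((n + 1) / (c + 1)) + 1.0 for w, c in df.items()}

    topics = []  # each topic: {"centroid": dict, "members": [post indices]}
    for i, (tokens, tags) in enumerate(posts):
        vec = select_features(tokens, set(tags), idf, k, boost)
        best, best_sim = None, 0.0
        for topic in topics:
            sim = cosine(vec, topic["centroid"])
            if sim > best_sim:
                best, best_sim = topic, sim
        if best is not None and best_sim >= threshold:
            best["members"].append(i)
            # Summing member vectors acts as a centroid, since cosine
            # similarity ignores vector length.
            for word, w in vec.items():
                best["centroid"][word] = best["centroid"].get(word, 0.0) + w
        else:
            topics.append({"centroid": dict(vec), "members": [i]})
    return topics
```

Calling `discover_topics` with a list of `(tokens, hashtag_tokens)` pairs returns one dictionary per discovered topic, with the indices of its posts under `"members"`. Dropping low-weight words before comparison is what keeps frequent but off-topic words from inflating the similarity between unrelated posts.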
