Microblogging sentiment analysis method based on text semantics and expression tendentiousness  Cited by: 23

Authors: Wang Wen[1,2], Wang Shufeng[1,2], Li Honghua[1]

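Affiliations: [1] School of Computer Information Engineering, Changzhou Institute of Technology, Changzhou 213002, Jiangsu, China; [2] Changzhou Key Laboratory of Software Technology Research and Application, Changzhou Institute of Technology, Changzhou 213002, Jiangsu, China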

Source: Journal of Nanjing University of Science and Technology, 2014, No. 6, pp. 733-738, 749 (7 pages)

Funding: Changzhou Institute of Technology university-level research fund projects (YN1316; YN1203)

Abstract: Machine-learning-based sentiment analysis methods for Chinese microblogs suffer from complex processing and low classification accuracy. To address these problems, this paper proposes a new sentiment analysis method. Dynamic microblog data are collected and preprocessed by combining a microblog crawler with the Web application programming interface (API). Microblog sentiment words are extracted and classified using the NTUSD and HowNet Chinese sentiment dictionaries, and the semantic similarity and sentiment tendency of words are calculated. The method jointly considers the weighting of expression (emoticon) and text sentiment tendencies, the enhancement of positive sentiment, and other factors. Experimental results show that the sentiment tendency of expressions plays an important role in the sentiment tendency of a microblog; that, with the ratio of expression to text sentiment tendency fixed, the choice of adjustment factor and neutral interval affects the accuracy of sentiment tendency judgment; and that, compared with a calculation model based on HowNet semantic similarity, the proposed method improves the accuracy of sentiment tendency judgment by about 5%.
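The abstract names three ingredients of the scoring step: a weighted combination of expression (emoticon) and text sentiment, a positive-sentiment adjustment, and a neutral interval. The record does not give the paper's formulas, so the following is only a minimal sketch of how such a combination might look. The lexicon entries, the function names, and the values of `emoticon_weight`, `positive_boost`, and `neutral_band` are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List

# Hypothetical toy lexicons standing in for polarities derived from
# NTUSD/HowNet and from an emoticon table. Values are illustrative only.
WORD_POLARITY: Dict[str, float] = {
    "happy": 1.0, "like": 0.8, "disappointed": -0.9, "hate": -1.0,
}
EMOTICON_POLARITY: Dict[str, float] = {
    "[laugh]": 1.0, "[tears]": -0.8, "[angry]": -1.0,
}


def mean_polarity(tokens: List[str], lexicon: Dict[str, float]) -> float:
    """Average polarity of the tokens found in the lexicon (0.0 if none)."""
    scores = [lexicon[t] for t in tokens if t in lexicon]
    return sum(scores) / len(scores) if scores else 0.0


def classify(
    words: List[str],
    emoticons: List[str],
    emoticon_weight: float = 0.6,   # assumed emoticon-to-text weighting
    positive_boost: float = 1.2,    # assumed positive-sentiment adjustment
    neutral_band: float = 0.1,      # assumed half-width of the neutral interval
) -> str:
    """Label a post as positive, negative, or neutral from two score sources."""
    text_score = mean_polarity(words, WORD_POLARITY)
    emoticon_score = mean_polarity(emoticons, EMOTICON_POLARITY)

    # Weighted sum of emoticon and text tendencies.
    score = emoticon_weight * emoticon_score + (1 - emoticon_weight) * text_score

    # Amplify positive scores only (one possible reading of "positive enhancement").
    if score > 0:
        score *= positive_boost

    # Scores inside the neutral interval are treated as neutral.
    if abs(score) <= neutral_band:
        return "neutral"
    return "positive" if score > 0 else "negative"


if __name__ == "__main__":
    print(classify(["happy"], ["[laugh]"]))
    print(classify(["disappointed"], ["[tears]"]))
    print(classify([], []))
```

In this sketch a post with conflicting signals, such as a negative word alongside a positive emoticon, is decided by the weighting, which mirrors the abstract's observation that the emoticon tendency matters and that the adjustment factor and neutral interval affect the outcome.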

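Keywords: text semantics; expression tendentiousness; microblog; sentiment analysis; machine learning; microblog crawler; application programming interface; sentiment dictionary; semantic similarity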

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
