Sentiment polarity classification method based on sentiment dictionary and ensemble learning  Cited by: 22


Authors: ZHU Jun; LIU Jiayong [1]; ZHANG Tengfei; QIU Limao (Information Security Institute, Sichuan University, Chengdu, Sichuan 610065, China)

Affiliation: [1] Information Security Institute, Sichuan University, Chengdu 610065

Source: Journal of Computer Applications (《计算机应用》), 2018, Issue A01, pp. 95-98, 107 (5 pages)

Abstract: Sentiment polarity analysis with machine learning currently requires a complete corpus, but such a corpus is difficult to construct for a specific domain, while classification that relies on a sentiment dictionary alone has low accuracy. To address these shortcomings, this paper proposes an ensemble learning method for sentiment polarity classification that combines an improved machine learning method with a sentiment dictionary. First, the Word2Vec feature extraction method is used to represent each review as a fixed-dimension vector, common machine learning classifiers are applied, and the best-performing classifier is identified. Next, a Naive Bayes (NB) classifier based on a sentiment dictionary is used to classify sentiment polarity. Finally, the two approaches are combined and evaluated on the publicly available hotel review dataset released by Tan Songbo. Theoretical analysis and experimental results show that the ensemble of a Support Vector Machine (SVM) classifier using Word2Vec features and the dictionary-based Naive Bayes classifier raises the accuracy and macro average of the positive class by 6.9 and 3 percentage points respectively, and the recall and macro average of the negative class by 8.8 and 5.1 percentage points respectively, effectively improving sentiment polarity classification.
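The abstract does not say how the dictionary-based Naive Bayes classifier is built or how its output is merged with the SVM's, so the following is only a minimal sketch under stated assumptions: a multinomial Naive Bayes with Laplace smoothing that counts only words found in the dictionary, and a weighted average of class probabilities (soft voting) as the combination rule. The class `LexiconNaiveBayes`, the function `soft_vote`, the toy English lexicon, the four training reviews, and the `svm_proba` values are all hypothetical stand-ins, not the paper's data or implementation.

```python
import math
from collections import Counter

# Hypothetical toy sentiment dictionary; the paper's actual lexicon is not given.
SENTIMENT_WORDS = {"clean", "friendly", "comfortable", "dirty", "noisy", "rude"}


class LexiconNaiveBayes:
    """Multinomial Naive Bayes that only counts words found in a sentiment dictionary."""

    def __init__(self, lexicon, alpha=1.0):
        self.lexicon = set(lexicon)
        self.alpha = alpha  # Laplace smoothing constant (assumed value)

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(t for t in tokens if t in self.lexicon)
        return self

    def predict_proba(self, tokens):
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for c in self.classes:
            total_words = sum(self.word_counts[c].values())
            score = math.log(self.doc_counts[c] / total_docs)  # class prior
            for t in tokens:
                if t in self.lexicon:  # words outside the dictionary are ignored
                    num = self.word_counts[c][t] + self.alpha
                    den = total_words + self.alpha * len(self.lexicon)
                    score += math.log(num / den)
            log_scores[c] = score
        # Turn log scores into probabilities, subtracting the max for stability.
        peak = max(log_scores.values())
        exp = {c: math.exp(s - peak) for c, s in log_scores.items()}
        norm = sum(exp.values())
        return {c: v / norm for c, v in exp.items()}


def soft_vote(proba_a, proba_b, weight_a=0.5):
    """Weighted average of two class-probability dicts; returns (label, merged)."""
    merged = {c: weight_a * proba_a[c] + (1 - weight_a) * proba_b[c] for c in proba_a}
    return max(merged, key=merged.get), merged


# Made-up tokenised reviews, used only to exercise the sketch.
train_docs = [
    ["room", "clean", "staff", "friendly"],
    ["bed", "comfortable", "clean"],
    ["room", "dirty", "street", "noisy"],
    ["staff", "rude", "dirty"],
]
train_labels = ["pos", "pos", "neg", "neg"]

nb = LexiconNaiveBayes(SENTIMENT_WORDS).fit(train_docs, train_labels)
nb_proba = nb.predict_proba(["clean", "room", "noisy", "friendly"])

# Stand-in for the Word2Vec + SVM classifier's probabilities (made-up numbers).
svm_proba = {"pos": 0.55, "neg": 0.45}

label, merged = soft_vote(svm_proba, nb_proba)
print(label, merged)
```

In this sketch `weight_a` controls how much the embedding-based classifier counts relative to the dictionary-based one. An equal weighting is used here purely for illustration; in practice it would be tuned on held-out data.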

Keywords: sentiment polarity classification; machine learning; sentiment dictionary; Word2Vec; Naive Bayes

Classification number: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]

 
