Authors: 罗森林 (LUO Sen-lin)[1]; 毛焱颖 (MAO Yan-ying); 潘丽敏 (PAN Li-min)[1]; 陈倩柔 (CHEN Qian-rou); 魏超 (WEI Cao)[1]
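Affiliation: [1] Information System and Security & Countermeasures Experimental Center, Beijing Institute of Technology, Beijing 100081, China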
Source: Transactions of Beijing Institute of Technology (《北京理工大学学报》), 2018, No. 11, pp. 1156-1162, 1176 (8 pages)
Funding: Beijing Institute of Technology Basic Research Fund (20160542013); National "242" Program project (2017A149)
Abstract: Text sentiment classification suffers from insufficient use of sentiment-related semantic features and from weak feature dimensionality reduction, both of which degrade classification performance. To address these problems, this paper proposes a high-precision sentiment classification method for online reviews that expands the feature set with semantically similar sentiment words and introduces statistical features between words. The method first uses a neural-network skip-gram model to generate word embeddings and, through a similarity measure over those embeddings, expands semantically similar words into sentiment features. It then reduces the feature dimension using statistical features between words. Finally, it builds an Adaboost classification model by weighting multiple weak classifiers and uses that model to classify the sentiment of online reviews. Experiments on public test sets of hotel reviews and mobile phone reviews show sentiment classification accuracies of 90.96% and 93.67%, respectively. Expanding semantically similar sentiment words enriches the sentiment-related semantic features of the text, and introducing statistical features between words gives better feature dimensionality reduction; together, the two steps further improve text sentiment classification.
Keywords: word embedding; Adaboost classification model; feature selection; Chinese reviews; sentiment classification
Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]
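Note: the abstract's first step, expanding sentiment features with words whose embeddings are similar to known sentiment words, might be sketched roughly as below. This is only an illustration, not the authors' implementation: the record does not say which similarity measure or cutoff the paper uses, so cosine similarity and the `threshold` value are assumptions, and the function names and the tiny hand-written `TOY_EMBEDDINGS` table are hypothetical stand-ins for vectors that a trained skip-gram model would supply.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def expand_sentiment_words(seeds, embeddings, threshold=0.8):
    """Return the seed words plus every vocabulary word whose embedding is
    at least `threshold` cosine-similar to some seed word.

    `threshold` is an assumed cutoff, not a value taken from the paper.
    """
    known_seeds = [s for s in seeds if s in embeddings]
    expanded = set(known_seeds)
    for word, vec in embeddings.items():
        if word in expanded:
            continue
        # Compare against the original seeds only, so expansion is not transitive.
        if any(cosine(vec, embeddings[s]) >= threshold for s in known_seeds):
            expanded.add(word)
    return expanded


# Hypothetical 3-dimensional vectors, hand-written for illustration only;
# real skip-gram embeddings would come from a model trained on review text.
TOY_EMBEDDINGS = {
    "good": [0.9, 0.1, 0.0],
    "great": [0.85, 0.15, 0.05],
    "bad": [-0.9, 0.1, 0.0],
    "awful": [-0.8, 0.2, 0.1],
    "room": [0.0, 0.1, 0.95],
}

if __name__ == "__main__":
    print(sorted(expand_sentiment_words(["good", "bad"], TOY_EMBEDDINGS)))
```

In a fuller pipeline, the expanded word set would feed the later steps the abstract describes (feature reduction using statistical features between words, then a boosted ensemble of weak classifiers); those steps are not sketched here because the record gives no details about them.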