A hybrid approach to sentiment classification and feature expansion strategy  Cited by: 4

Authors: XIA Rui[1], ZONG Chengqing[1]

Affiliation: [1] Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China

Source: CAAI Transactions on Intelligent Systems (《智能系统学报》), 2011, No. 6, pp. 483-488 (6 pages)

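Funding: National "863" Program of China (2008AA01Z148); Heilongjiang Province Science Fund for Distinguished Young Scholars (JC200703); Harbin Special Research Fund for Science and Technology Innovation Talents (2007RFXXG009)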

Abstract: This paper addresses document-level sentiment text classification. It analyzes the performance of traditional generative and discriminative models and proposes a cascaded hybrid model for sentiment classification, together with a feature expansion strategy based on syntactic structure. In the hybrid model, a generative classifier (naive Bayes, NB) and a discriminative classifier (support vector machine, SVM) are combined in a two-stage cascade, with the aim of eliminating the errors that arise when a single model's decision confidence is too low on samples near the classification boundary. On top of the hybrid model, an efficient strategy for incorporating dependency-syntax features is presented. The strategy raises the accuracy of the system while avoiding the increase in computation that traditional feature expansion methods incur. Experimental results show that the hybrid model and the feature expansion strategy significantly improve on traditional methods in both classification accuracy and efficiency.
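The cascade described in the abstract can be illustrated with a minimal sketch. The abstract does not state which classifier runs first, how confidence is measured, or how the dependency features are built, so everything below is an assumption made for illustration: naive Bayes runs first, its posterior probability is the confidence score, a hypothetical `threshold` of 0.9 decides which samples are deferred, and adjacent-word pairs stand in for dependency relation pairs (which would need a parser). The linear SVM is trained with a simple Pegasos-style subgradient method, not with the authors' toolkit, and the training data is a toy set invented for this example.

```python
import math
from collections import Counter

def expand(tokens):
    """Hypothetical feature expansion: unigrams plus adjacent-word pairs
    (a stand-in for dependency relation pairs, which require a parser)."""
    return tokens + [a + "_" + b for a, b in zip(tokens, tokens[1:])]

def train_nb(docs, labels):
    """Multinomial naive Bayes counts; labels are +1 / -1."""
    counts = {1: Counter(), -1: Counter()}
    n_docs = Counter(labels)
    for tokens, y in zip(docs, labels):
        counts[y].update(tokens)
    vocab = set(counts[1]) | set(counts[-1])
    return counts, n_docs, vocab

def nb_posterior_pos(model, tokens):
    """P(y = +1 | tokens) with Laplace smoothing; unseen tokens are skipped."""
    counts, n_docs, vocab = model
    log_odds = math.log(n_docs[1] / n_docs[-1])
    for y, sign in ((1, 1.0), (-1, -1.0)):
        total = sum(counts[y].values()) + len(vocab)
        for t in tokens:
            if t in vocab:
                log_odds += sign * math.log((counts[y][t] + 1) / total)
    log_odds = max(-30.0, min(30.0, log_odds))  # avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-log_odds))

def train_linear_svm(docs, labels, epochs=50, lam=0.01):
    """Linear SVM via Pegasos-style subgradient descent on the hinge loss."""
    w = Counter()
    t = 0
    for _ in range(epochs):
        for tokens, y in zip(docs, labels):
            t += 1
            eta = 1.0 / (lam * t)
            x = Counter(tokens)
            margin = y * sum(w[f] * v for f, v in x.items())
            for f in list(w):               # regularization shrink
                w[f] *= (1.0 - eta * lam)
            if margin < 1.0:                # hinge-loss subgradient step
                for f, v in x.items():
                    w[f] += eta * y * v
    return w

def cascade_predict(nb, svm, tokens, threshold=0.9):
    """Stage 1: accept the NB label if it is confident enough.
    Stage 2: otherwise defer to the SVM on expanded features."""
    p = nb_posterior_pos(nb, tokens)
    if max(p, 1.0 - p) >= threshold:
        return (1 if p >= 0.5 else -1), "nb"
    score = sum(svm[f] for f in expand(tokens))
    return (1 if score >= 0 else -1), "svm"

# Toy training data, invented for this sketch.
train_docs = [s.split() for s in (
    "great movie loved it", "wonderful acting great plot", "not bad",
    "terrible movie hated it", "boring plot awful acting", "not good")]
train_labels = [1, 1, 1, -1, -1, -1]

nb = train_nb(train_docs, train_labels)
svm = train_linear_svm([expand(d) for d in train_docs], train_labels)

for text in ("great wonderful loved", "not bad plot"):
    print(text, "->", cascade_predict(nb, svm, text.split()))
```

In this sketch the expanded features are computed at prediction time only for the samples that the first stage defers, which is one plausible reading of how a cascade could keep the cost of feature expansion low; the paper itself should be consulted for the actual design.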

Keywords: text classification; sentiment classification; hybrid model; feature expansion

Classification code: TP391.1 [Automation and Computer Technology: Computer Application Technology]

 
