Research on feature construction of Uyghur text sentiment classification

Cited by: 1


Authors: Raxida Turhuntay; Wushour Slamu [1]

Affiliations: [1] College of Information Science & Engineering, Xinjiang University, Urumqi 830046, China; [2] College of Electronic & Information Engineering, Yili Normal University, Yining, Xinjiang 835000, China

Source: Application Research of Computers (计算机应用研究), 2019, No. 12, pp. 3548-3552 (5 pages)

Funding: National "973" Program of China (2014CB340506); National Natural Science Foundation of China (61363063)

Abstract: Systematic research on feature representation for Uyghur sentiment classification is currently lacking. Taking traditional n-gram features as a basis, this paper extracts new features and combined features at different scales from a sentiment-annotated Uyghur corpus, and uses a support vector machine (SVM) classifier to classify the corpus into positive and negative sentiment. The experimental results show that, among the basic features extracted, unigram features give the best classification performance. Combining unigram features with phrase features improves performance further: the best combined result is 1.78% higher in classification accuracy than unigram features alone. This paper is the first to comprehensively evaluate the classification performance of different features on a unified annotated data set, and the results can serve as a reference for future research on Uyghur sentiment classification.
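The feature construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the whitespace tokenization, the use of bigrams as a stand-in for the paper's phrase features, the function names `ngrams` and `extract_features`, and the toy tokens are all assumptions made for illustration. It shows only how unigram counts and a combined unigram-plus-bigram feature set could be built before being passed to a classifier such as an SVM.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the contiguous n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(text, orders=(1,)):
    """Count n-gram features for each requested order.

    Each feature name is prefixed with its order (e.g. "1g:", "2g:") so that
    features of different orders stay distinct when combined.
    """
    tokens = text.split()  # assumption: whitespace tokenization
    feats = Counter()
    for n in orders:
        for gram in ngrams(tokens, n):
            feats[f"{n}g:{gram}"] += 1
    return feats


# Toy placeholder tokens, not taken from the paper's corpus.
sample = "t1 t2 t3 t2"

# Unigram-only features (the best basic feature according to the abstract).
unigram = extract_features(sample, orders=(1,))

# Combined features: unigrams plus bigrams (hypothetical stand-in for phrases).
combined = extract_features(sample, orders=(1, 2))
```

Each resulting count dictionary would then be mapped to a fixed-length vector over a shared vocabulary before training; that step and the SVM itself are omitted here.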

Keywords: sentiment classification; feature construction; combined features; Uyghur

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
