Authors: 热西旦木·吐尔洪太 (Raxida Turhuntay); 吾守尔·斯拉木 (Wushour Slamu) [1] (College of Information Science & Engineering, Xinjiang University, Urumqi 830046, China; College of Electronic & Information Engineering, Yili Normal University, Yili, Xinjiang 835000, China)
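Affiliations: [1] College of Information Science and Engineering, Xinjiang University, Urumqi 830046; [2] College of Electronic and Information Engineering, Yili Normal University, Yining, Xinjiang 835000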
Source: 《计算机应用研究》 (Application Research of Computers), 2019, No. 12, pp. 3548-3552 (5 pages)
Funding: National "973" Program of China (2014CB340506); National Natural Science Foundation of China (61363063)
Abstract: Systematic research on feature representation for Uyghur text sentiment classification is currently lacking. Taking traditional n-gram features as a basis, this paper extracts new features and combined features at different scales from an annotated Uyghur sentiment corpus and uses a support vector machine (SVM) classifier to classify the corpus into positive and negative sentiment. The experimental results show that, among the basic features extracted, unigram features give the best classification performance, and that combining unigram features with phrase features improves it further: the best result of the combined features is 1.78% higher in classification accuracy than that of unigram features alone. This paper is the first to comprehensively evaluate the classification performance of different features on a uniformly annotated data set, and its results can serve as a reference for future research on Uyghur sentiment classification.
Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]
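The feature combination described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the whitespace tokenization, the use of bigrams as a stand-in for the paper's phrase features, the helper names `ngrams` and `extract_features`, and the English placeholder sentence are all assumptions made for illustration only.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the n-grams (tokens joined by a space) of a token sequence."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(text, orders=(1,)):
    """Count n-gram features of the given orders in a whitespace-tokenized text.

    Each feature key is prefixed with its order so that features of
    different orders never collide when they are combined.
    """
    tokens = text.split()
    features = Counter()
    for n in orders:
        for gram in ngrams(tokens, n):
            features[f"{n}g:{gram}"] += 1
    return features


# Placeholder sentence (assumption); a real experiment would use annotated Uyghur text.
sample = "the film was not good"
unigram_only = extract_features(sample, orders=(1,))
combined = extract_features(sample, orders=(1, 2))  # unigrams plus bigrams
print(len(unigram_only), len(combined))
```

Sparse counts of this kind could then be passed to a linear classifier such as an SVM; the classifier is left out here because the record does not specify its configuration.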