Emotional Analysis of Uzbek Short Text Based on Emotional Dictionary and Annotated Corpus

Cited by: 2


Author: YUAN Wei (原伟) (Luoyang Campus, Strategic Support Force Information Engineering University, Luoyang 471003, China)

Affiliation: [1] Luoyang Campus, Strategic Support Force Information Engineering University, Luoyang, Henan 471003

Source: Journal of Minzu University of China (Natural Sciences Edition), 2022, No. 2, pp. 5-12 (8 pages)

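Funding: Major Project of the National Social Science Fund of China (20&ZD120); Key Project of the National Social Science Fund of China (20AZD130); Henan Province Philosophy and Social Science Planning Project (2021BYY024).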

Abstract: This paper studies Uzbek, the language of a cross-border ethnic group in Central Asia. A sentiment lexicon of 6,451 entries was built, covering adjectives, nouns, verbs, degree and negative adverbs, negation words, adversative and progressive conjunctions, and complex phrases. A sentiment annotation scheme was designed, and a corpus of 6,000 online comments was manually annotated with sentiment category, means of expression, and sentiment orientation. Sentiment analysis algorithms were designed for Uzbek plain sentences, adverb-modified sentences, non-verbal negative sentences, double-negative sentences, adversative sentences, and progressive sentences. Short-text sentiment analysis experiments were then carried out using the sentiment corpus and online reviews of software applications as test sets. The results demonstrate the effectiveness of the sentiment lexicon, the sentiment corpus, and the sentiment analysis algorithms, but also expose the lexicon's weaknesses in data size, coverage, and granularity, as well as shortcomings in text preprocessing.
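The abstract names the sentence types the algorithms handle but does not give the rules themselves. As a rough illustration only, the sketch below shows how a lexicon-based scorer might treat three of those types: adverb-modified, non-verbal negative, and adversative sentences. Every function name, lexicon entry, and weight here is an assumption made for illustration; none of it is taken from the paper's lexicon or algorithms.

```python
# Hypothetical toy lexicon: entries and weights are illustrative assumptions,
# not drawn from the paper's 6,451-entry sentiment lexicon.
POLARITY = {"yaxshi": 1.0, "yomon": -1.0}  # "good", "bad"
DEGREE = {"juda": 1.5}                     # "very"; precedes the word it modifies
NEGATORS = {"emas"}                        # non-verbal negation; follows the word it negates
ADVERSATIVE = {"lekin", "ammo"}            # "but"


def score_clause(tokens):
    """Sum word polarities, applying degree adverbs and trailing negators."""
    scores = []
    boost = 1.0
    for tok in tokens:
        if tok in DEGREE:
            boost *= DEGREE[tok]           # intensify the next sentiment word
        elif tok in POLARITY:
            scores.append(POLARITY[tok] * boost)
            boost = 1.0
        elif tok in NEGATORS and scores:
            scores[-1] = -scores[-1]       # flip the preceding sentiment word
    return sum(scores)


def score_sentence(text, but_weight=2.0):
    """Split on adversative conjunctions; later clauses count more (assumed weight)."""
    clauses = [[]]
    for raw in text.lower().split():
        tok = raw.strip(".,!?")
        if tok in ADVERSATIVE:
            clauses.append([])
        else:
            clauses[-1].append(tok)
    scores = [score_clause(c) for c in clauses]
    return scores[0] + but_weight * sum(scores[1:])


def classify(text):
    """Map a sentence score to a coarse polarity label."""
    s = score_sentence(text)
    return "positive" if s > 0 else "negative" if s < 0 else "neutral"
```

For example, `classify("juda yaxshi")` treats "very good" as positive, while `classify("yaxshi emas")` flips "good" to negative. Because each negator flips the preceding score, two negators in a row cancel out, which is one simple way to model double negation. Progressive sentences and complex phrases are not handled in this sketch.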

Keywords: Uzbek; sentiment analysis; sentiment lexicon

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
