Authors: SI Qu-zhuo-ga (斯曲卓嘎) ; YONG Tso (拥措) ; SAI Ming-yu (赛鸣宇)
Affiliations: [1] School of Information Science and Technology, Tibet University, Lhasa 850000, China [2] Key Laboratory of Tibetan Information Technology and Artificial Intelligence in Tibet Autonomous Region, Lhasa 850000, China [3] Engineering Research Center of Tibetan Information Technology, Ministry of Education, Lhasa 850000, China
Source: Computer Technology and Development (《计算机技术与发展》), 2025, No. 2, pp. 130-137 (8 pages)
Funding: Science and Technology Program Project of the Tibet Autonomous Region (XZ202401JD0010).
Abstract: Extracting Tibetan emotional triplets (aspect word, emotional word, emotional polarity) is a core task in fine-grained sentiment analysis and is crucial for a deep understanding of Tibetan emotional expression and trends. However, the unique linguistic structure and cultural background of Tibetan make its emotional expression differ from that of other languages, which increases the complexity of fine-grained sentiment analysis. To improve the extraction of Tibetan emotional triplets, this paper proposes the OpinionNet-OTE-MTL model, which integrates part-of-speech information, Word2Vec word vectors, and absolute position vectors, and performs feature extraction with a bidirectional long short-term memory network (BiLSTM). Because Tibetan has many part-of-speech categories, the paper analyzes a large number of sentiment datasets and extracts 11 parts of speech to assist model recognition. To validate the effectiveness of OpinionNet-OTE-MTL, comparative and ablation experiments were conducted on a self-constructed dataset of 2000 Tibetan sentences for fine-grained sentiment analysis. The ablation experiments show that part-of-speech information affects the model more than positional information, raising the triplet extraction F1 score by 3.06 percentage points. The comparative experiments show that, after part-of-speech and positional features are integrated into the model, precision, recall, and F1 on the emotional triplet extraction (Triple) task increase by 4.73, 6, and 6.14 percentage points, respectively, over the baseline. Integrating part-of-speech and absolute position information enables the model to understand the grammatical structure and semantic rules of Tibetan more precisely, thereby improving the accuracy of the emotional triplet classification task.
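The precision, recall, and F1 figures reported for the triplet task can be illustrated with a minimal sketch. It assumes exact-match scoring (a predicted triplet counts only if aspect word, emotional word, and polarity all match a gold triplet) and micro-averaging over sentences; the abstract does not state the paper's actual scoring protocol, and the function name `triplet_prf`, the polarity labels, and the toy data are hypothetical.

```python
from typing import List, Set, Tuple

# (aspect word, emotional word, polarity); the label strings are made up.
Triplet = Tuple[str, str, str]


def triplet_prf(gold: List[Set[Triplet]], pred: List[Set[Triplet]]):
    """Micro-averaged precision, recall, F1 under exact matching (assumed)."""
    tp = n_pred = n_gold = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)   # correct only if all three fields match
        n_pred += len(p)
        n_gold += len(g)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy data, not taken from the paper's dataset.
gold = [
    {("screen", "clear", "POS")},
    {("battery", "weak", "NEG"), ("price", "fair", "POS")},
]
pred = [
    {("screen", "clear", "POS")},
    {("battery", "weak", "POS")},  # wrong polarity, so not a match
]
precision, recall, f1 = triplet_prf(gold, pred)
print(f"P={precision:.2%} R={recall:.2%} F1={f1:.2%}")
```

Under this scoring, an improvement "by 6.14 percentage points" means an absolute difference between two such F1 values expressed as percentages, not a relative change.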
Keywords: Tibetan; Word2Vec; part of speech; positional features; emotional triplets
Classification No.: TP391 [Automation and Computer Technology: Computer Application Technology]