Chinese text sentiment analysis based on ELMo and Bi-SAN  (Cited by: 12)

Authors: Li Zheng; Chen Li[1]; Zhang Shuang (School of Information Science & Technology, Northwest University, Xi'an 710127, China)

Affiliation: [1] School of Information Science & Technology, Northwest University, Xi'an 710127, China

Source: Application Research of Computers, 2021, No. 8, pp. 2303-2307 (5 pages)

Funding: National Key R&D Program of China (2020YFC1523301); Key R&D Program of Shaanxi Province (2019ZDLGY10-01).

Abstract: Current sentiment analysis models usually generate static word embeddings with word2vec, GloVe and similar methods, and traditional convolutional or recurrent deep models cannot fully attend to the context, so the extracted features are insufficient and sentiment judgment suffers. To address these problems, this paper proposed a Chinese text sentiment analysis model based on ELMo (embeddings from language models) and a bidirectional self-attention network (Bi-SAN). First, ELMo language-model training produced word vectors that fuse a word with its context, which resolves polysemy. Meanwhile, a pre-trained skip-gram embedding replaced the randomly initialized embedding layer of ELMo, improving the convergence speed of the model. The model then used Bi-SAN to extract features; owing to the self-attention mechanism, Bi-SAN can attend to the full context of every word and extracts more comprehensive features. Compared with several existing sentiment analysis models, the proposed model achieved higher F1 scores on a hotel review dataset and the NLPCC2014 Task 2 Chinese dataset, which validates its effectiveness.
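The pipeline sketched in the abstract (contextual word vectors fed into a forward- and backward-masked self-attention encoder, then pooled for classification) can be illustrated with a minimal PyTorch sketch. This is only an illustration of a DiSAN-style Bi-SAN under assumptions, not the authors' implementation: the class names, the mean-pooling step, and the random tensor standing in for real ELMo embeddings are all hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DirectionalSelfAttention(nn.Module):
    """Scaled dot-product self-attention restricted to one direction:
    forward lets each token attend to itself and earlier tokens,
    backward lets it attend to itself and later tokens."""
    def __init__(self, dim, forward_dir=True):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        self.forward_dir = forward_dir
        self.scale = dim ** 0.5

    def forward(self, x):                                   # x: (batch, seq_len, dim)
        q, k, v = self.q(x), self.k(x), self.v(x)
        scores = q @ k.transpose(-2, -1) / self.scale       # (batch, seq_len, seq_len)
        n = x.size(1)
        allow = torch.ones(n, n, dtype=torch.bool, device=x.device)
        allow = torch.tril(allow) if self.forward_dir else torch.triu(allow)
        scores = scores.masked_fill(~allow, float("-inf"))  # block the other direction
        return F.softmax(scores, dim=-1) @ v

class BiSAN(nn.Module):
    """Bidirectional self-attention: run forward- and backward-masked attention
    in parallel, concatenate the two context views, pool, and classify."""
    def __init__(self, dim, num_classes=2):
        super().__init__()
        self.fwd = DirectionalSelfAttention(dim, forward_dir=True)
        self.bwd = DirectionalSelfAttention(dim, forward_dir=False)
        self.classifier = nn.Linear(2 * dim, num_classes)

    def forward(self, x):                                   # x: contextual word vectors
        h = torch.cat([self.fwd(x), self.bwd(x)], dim=-1)   # (batch, seq_len, 2*dim)
        return self.classifier(h.mean(dim=1))               # mean pooling over tokens

# Stand-in for ELMo embeddings: a batch of 8 sentences, 40 tokens, 256 dimensions.
emb = torch.randn(8, 40, 256)
logits = BiSAN(256)(emb)                                    # (8, 2) sentiment logits
```

In the paper's setting, the random tensor would be replaced by ELMo outputs whose embedding layer was initialized from pre-trained skip-gram vectors; the sketch only shows how a forward and a backward masked attention pass can together cover the full context of each word.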

Keywords: sentiment analysis; word embedding; ELMo; self-attention mechanism

Classification code: TP391 [Automation and Computer Technology / Computer Application Technology]

 
