Aspect-level cross-domain sentiment analysis based on capsule network


Authors: MENG Jiana; LYU Pin; YU Yuhai; SUN Shichang; LIN Hongfei [2]

Affiliations: [1] College of Computer Science and Engineering, Dalian Minzu University, Dalian, Liaoning 116600, China; [2] School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China

Source: Journal of Computer Applications, 2022, No. 12, pp. 3700-3707 (8 pages)

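Funding: National Natural Science Foundation of China (61876031); 2019 Scientific Research Fund Project of the Education Department of Liaoning Province (LJYT201906).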

Abstract: In cross-domain sentiment analysis, labeled samples in the target domain are severely insufficient, feature distributions differ greatly across domains, and the sentiment polarities expressed by features also differ considerably between domains; all of these problems lead to low classification accuracy. To address them, an aspect-level cross-domain sentiment analysis method based on a capsule network is proposed. First, feature representations of the text are obtained with the BERT (Bidirectional Encoder Representations from Transformers) pre-trained model. Second, for fine-grained aspect-level sentiment features, a Recurrent Neural Network (RNN) is used to fuse context features with aspect features. Third, a capsule network with dynamic routing is used to distinguish overlapping features, and a sentiment classification model is built on the capsule network. Finally, a small amount of target-domain data is used to fine-tune the model, realizing cross-domain transfer learning. The best F1 score of the proposed method reaches 95.7% on the Chinese dataset and 91.8% on the English dataset, which effectively addresses the low accuracy caused by insufficient training samples.
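As an illustration of the dynamic routing step mentioned in the abstract, the following is a minimal NumPy sketch of routing-by-agreement in its commonly used form. The abstract does not give the authors' configuration, so the tensor shapes, the three routing iterations, and the names `squash`, `softmax`, and `dynamic_routing` are assumptions made for this example only; it is not the authors' implementation.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-9):
    """Shrink each vector's length into [0, 1) while keeping its direction."""
    sq_norm = np.sum(s * s, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def dynamic_routing(u_hat, num_iters=3):
    """Route lower-level capsules to higher-level ones.

    u_hat: prediction vectors, shape (n_in, n_out, dim_out) (assumed layout).
    Returns the output capsules (n_out, dim_out) and the coupling
    coefficients (n_in, n_out) used in the final iteration.
    """
    n_in, n_out, _ = u_hat.shape
    b = np.zeros((n_in, n_out))                    # routing logits
    for _ in range(num_iters):
        c = softmax(b, axis=1)                     # each input capsule splits its vote
        s = np.einsum("ij,ijd->jd", c, u_hat)      # weighted sum per output capsule
        v = squash(s)                              # output capsule vectors
        b = b + np.einsum("ijd,jd->ij", u_hat, v)  # raise logits where predictions agree
    return v, c

# Toy input: 6 lower-level capsules, 3 sentiment classes, 8-dimensional outputs.
rng = np.random.default_rng(0)
u_hat = rng.normal(size=(6, 3, 8))
v, c = dynamic_routing(u_hat)
lengths = np.linalg.norm(v, axis=-1)  # capsule length acts as a class score
```

In this sketch the length of each output capsule can be read as a score for one sentiment class, and the coupling coefficients show how strongly each lower-level feature is assigned to each class, which is how routing can separate overlapping features.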

Keywords: aspect-level sentiment analysis; cross-domain; capsule network; recurrent neural network; pre-training

Classification code: TP389.1 [Automation and Computer Technology: Computer System Architecture]

 
