Sentiment Analysis of Image-Text Based on Multiple Perspectives


Authors: GAO Weijun (高玮军) [1]; SUN Zibi (孙子博); LIU Shujun (刘书君)

Affiliation: [1] School of Computer and Communication, Lanzhou University of Technology, Lanzhou 730000, China

Source: Computer Science (《计算机科学》), 2024, Issue S02, pp. 128-135 (8 pages)

Funding: National Natural Science Foundation of China (51668043).

Abstract: On social media, viewers are often drawn first to the facial expressions of the people in an image, which convey emotion directly. A complete expression of emotion, however, also depends on the scene, which supplies the context and support that sentiment analysis needs. Many studies overlook the role of the scene in emotional expression, so their results are suboptimal. To address three problems in existing image-text bimodal sentiment analysis models (neglect of alignment between modalities, insufficient image feature extraction, and limited generalization ability), this paper proposes the Multi-view Image-Text Emotion Analysis Network Model (MITN). In image feature extraction, an attention mechanism is added for facial expressions to better capture the expressions of the people shown, and dilated convolution with a dilation rate is added for scenes to enlarge the receptive field. Scene-VGG is trained by transfer learning on the Places dataset to make full use of scene information. Text features are extracted with BERT+BiGRU. Experiments on the multimodal sentiment dataset MVSA verify the effectiveness of the proposed model.
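The abstract states that dilated convolution enlarges the receptive field through its dilation rate. The following is a minimal sketch of that general idea in plain Python, not the authors' implementation: the function names, the 3-tap kernels, and the dilation rates 1, 2, 4 are all assumptions chosen for illustration, and the paper's actual Scene-VGG configuration is not given on this page.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span of input covered by one dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def receptive_field(layers):
    """Receptive field of a stack of stride-1 convolutions.

    `layers` is a list of (kernel_size, dilation) pairs (hypothetical values).
    Each stride-1 layer widens the field by (kernel_size - 1) * dilation.
    """
    rf = 1
    for kernel_size, dilation in layers:
        rf += (kernel_size - 1) * dilation
    return rf


def dilated_conv1d(signal, kernel, dilation=1):
    """Unpadded 1-D dilated cross-correlation, written out for clarity."""
    span = effective_kernel_size(len(kernel), dilation)
    out = []
    for start in range(len(signal) - span + 1):
        # Kernel taps are spaced `dilation` samples apart in the input.
        out.append(sum(w * signal[start + i * dilation]
                       for i, w in enumerate(kernel)))
    return out


# Three ordinary 3-tap layers versus three dilated ones (rates 1, 2, 4).
plain = receptive_field([(3, 1), (3, 1), (3, 1)])
dilated = receptive_field([(3, 1), (3, 2), (3, 4)])
```

Under these assumed settings the stack of ordinary layers sees 7 input positions while the dilated stack sees 15, with the same number of weights per layer. That is the sense in which a dilation rate widens the receptive field without adding parameters.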

Keywords: multimodal; sentiment analysis; multi-view; transfer learning; attention mechanism

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
