Emotion Recognition Algorithm Based on Multimodal Cross-Interaction (基于多模态交叉互动的情感识别算法)

Authors: ZHANG Hui (张慧); LI Feifei (李菲菲) [1]

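Affiliation: [1] School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China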

Source: Electronic Science and Technology (《电子科技》), 2024, No. 10, pp. 81-87 (7 pages)

Funding: Program for Professor of Special Appointment (Eastern Scholar) at Shanghai Institutions of Higher Learning (ES2015XX).

Abstract: Because single-modality emotion recognition has inherent limitations, research attention has shifted to multimodal emotion recognition. Work in this area centers on two problems: extracting the best features from each modality, and fusing the extracted features effectively. This paper proposes an emotion recognition method based on multimodal cross-interaction that captures the diversity of modality expressions. A separate encoder for each modality extracts features carrying emotional information, and interaction modules built from stacked inter-modality attention layers model the latent relationships among the visual, text, and audio modalities. Experiments are conducted on the CMU-MOSI and CMU-MOSEI emotion recognition datasets, which combine text, speech, and images. On the five metrics Acc2 (Accuracy2), Acc7 (Accuracy7), F1, MAE (Mean Absolute Error), and Corr (Correlation), the method scores 86.5%, 47.7%, 86.4%, 0.718, and 0.776 on CMU-MOSI and 83.4%, 51.5%, 83.4%, 0.566, and 0.737 on CMU-MOSEI, respectively. These results indicate a significant performance improvement and also show that the cross-mapping mutual representation mechanism between modalities performs better than single-modality representation methods.
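The abstract does not give the internals of the interaction module, so the following is only a minimal sketch of the general idea of inter-modality attention: vectors from one modality act as queries over the vectors of another. It assumes plain scaled dot-product attention with no learned projections, no multiple heads, and no stacking; the function names and the toy feature values are hypothetical and are not taken from the paper.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Scaled dot-product attention from one modality onto another.

    queries: vectors from the target modality (e.g. text tokens)
    keys, values: vectors from the source modality (e.g. audio frames)
    Returns one vector per query, each a weighted average of `values`.
    Assumption: no learned projection matrices, unlike a trained model.
    """
    dim = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every source-modality key.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # Weighted average of the source-modality values, per component.
        outputs.append([
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ])
    return outputs

# Hypothetical toy features: 2 text tokens attend over 3 audio frames.
text = [[1.0, 0.0], [0.0, 1.0]]
audio = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
text_from_audio = cross_attention(text, audio, audio)
```

In this sketch each text token is re-expressed as a mixture of audio frames. Swapping the roles of the arguments gives the audio-from-text direction, and a cross-interaction design would typically apply such layers in both directions for each pair of modalities.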

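Keywords: multimodal; feature fusion; emotion recognition; sentiment analysis; attention mechanism; Transformer; Bidirectional Encoder Representations from Transformers (BERT); cross mapping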

Classification code: TP391.41 [Automation and Computer Technology: Computer Application Technology]

 
