Model-Level Fusion Dimensional Emotion Recognition Method Based on the Multi-Head Attention Mechanism (cited by: 10)

Model Level Fusion Dimension Emotion Recognition Method Based on Transformer

Authors: DONG Yongfeng; SU Haiyang; LIU Bin [2]; TAO Jianhua (School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China; National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China)

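Affiliations: [1] School of Artificial Intelligence and Data Science, Hebei University of Technology, Tianjin 300401; [2] Pattern Recognition Laboratory, Institute of Automation, Chinese Academy of Sciences, Beijing 100190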

Source: Journal of Signal Processing, 2021, No. 5, pp. 885-892 (8 pages)

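Funding: National Key Research and Development Program of China (2017YFB1002804); Key Program of the National Natural Science Foundation of China (61831022, 61771472, 61901473, 61902106); Natural Science Foundation of Tianjin (19JCZDJC40000); Natural Science Foundation of Hebei Province (F2020202028).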

Abstract: In recent years, emotion recognition has become a hot research topic in human-computer interaction, and multimodal dimensional emotion recognition, which can detect subtle emotional changes, has attracted increasing attention. A central question in multimodal dimensional emotion recognition is how to fuse the emotional information of different modalities effectively. Feature-level fusion faces problems of effective feature extraction and modality synchronization, while decision-level fusion has difficulty relating the feature information of different modalities. To address these problems, this paper adopts a model-level fusion strategy and proposes a multimodal dimensional emotion recognition method based on the multi-head attention mechanism (Transformer). An audio model, a video model, and a multimodal fusion model are constructed to learn deep features of the information streams, and their output is fed into a bidirectional long short-term memory (BiLSTM) network to obtain the final emotion prediction. Compared with the different baseline methods, the proposed method achieves the best performance on both arousal and valence; it captures emotional information effectively at a high level and thereby fuses audio and video information more effectively.
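The model-level fusion idea in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the paper's layer design, so this is not the authors' implementation: the function names (`multi_head_attention`, `init_weights`), the toy dimensions, the random weights, and the choice to let audio frames act as queries over video frames are all assumptions made for illustration. The BiLSTM regression stage is omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(query, key, value, weights, num_heads):
    """Scaled dot-product multi-head attention on (time, d_model) arrays."""
    w_q, w_k, w_v, w_o = weights
    t_q, d_model = query.shape
    d_head = d_model // num_heads

    def split_heads(x):
        # (time, d_model) -> (num_heads, time, d_head)
        return x.reshape(x.shape[0], num_heads, d_head).transpose(1, 0, 2)

    q = split_heads(query @ w_q)
    k = split_heads(key @ w_k)
    v = split_heads(value @ w_v)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, t_q, t_k)
    attn = softmax(scores, axis=-1)
    context = attn @ v                                   # (heads, t_q, d_head)
    merged = context.transpose(1, 0, 2).reshape(t_q, d_model)
    return merged @ w_o, attn

def init_weights(rng, d_model):
    # Hypothetical random projections; a real model would learn these.
    return [rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
            for _ in range(4)]

rng = np.random.default_rng(0)
d_model, num_heads = 16, 4
audio = rng.standard_normal((10, d_model))  # 10 audio frames (toy data)
video = rng.standard_normal((6, d_model))   # 6 video frames (toy data)

# Stand-ins for the audio and video models: self-attention per modality.
audio_h, _ = multi_head_attention(audio, audio, audio,
                                  init_weights(rng, d_model), num_heads)
video_h, _ = multi_head_attention(video, video, video,
                                  init_weights(rng, d_model), num_heads)

# Stand-in for the fusion model: audio queries attend to video keys/values.
fused, attn = multi_head_attention(audio_h, video_h, video_h,
                                   init_weights(rng, d_model), num_heads)
print(fused.shape, attn.shape)
```

Because attention aligns every audio frame with every video frame, the two streams in this sketch do not need the same number of frames, which is one way a model-level scheme can sidestep the synchronization problem the abstract attributes to feature-level fusion.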

Keywords: dimensional emotion recognition; multimodal emotion fusion; model-level fusion; multi-head attention mechanism

Classification code: TP391.4 [Automation and Computer Technology: Computer Application Technology]
