Audio Expression Emotion Recognition Algorithm Using Modal Decomposition Learning (original title: 基于模态分解学习的语音表情情感识别算法)


Authors: JIAO Shuang (焦爽); CHEN Guang-hui (陈光辉)

Affiliations: [1] Central South Electric Power Test Research Institute, China Datang Group Science and Technology Research Institute Co., Ltd., Zhengzhou, Henan 450000, China; [2] National Mobile Communications Research Laboratory, School of Information Science and Engineering, Southeast University, Nanjing, Jiangsu 210096, China

Source: Computer Simulation (《计算机仿真》), 2025, No. 1, pp. 182-187 (6 pages)

Funding: National Natural Science Foundation of China (U1504622).

Abstract: With the rapid development of artificial intelligence, emotion recognition, a core component of human-computer interaction, has attracted widespread attention. Previous work has either studied the correlation between the audio and expression modalities or designed complex fusion strategies, while ignoring the distribution gap and information redundancy caused by the heterogeneity between the two modalities, which seriously degrades the performance of audio-expression bimodal emotion recognition. To address this problem, the paper proposes an audio-expression bimodal emotion recognition algorithm based on modal decomposition learning. Specifically, to reduce the distribution gap between the audio and expression modalities, the algorithm first uses an improved contrastive learning approach to obtain the features shared by the two modalities. It then extracts the specific features of the audio and expression modalities separately and applies orthogonal constraints to reduce the redundancy among these features. Next, the shared and specific features of both modalities are fused to perform emotion recognition. Finally, experimental results on three public emotion datasets show that the algorithm achieves higher recognition rates than the other benchmark algorithms.
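The abstract names three ingredients: a contrastive objective for shared features, an orthogonal constraint between shared and specific features, and a fusion step. The sketch below illustrates one common way to write each of them in plain Python. It is not the authors' implementation: the "improved" contrastive loss is not specified in the abstract, so a standard symmetric InfoNCE loss stands in for it, fusion is assumed to be simple concatenation, and all function names (`info_nce`, `orthogonality_penalty`, `fuse`) and the `temperature` parameter are hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v, eps=1e-8):
    """Scale a vector to (approximately) unit L2 norm."""
    norm = math.sqrt(dot(v, v)) + eps
    return [a / norm for a in v]


def info_nce(audio_shared, expr_shared, temperature=0.1):
    """Symmetric InfoNCE loss (a stand-in, not the paper's improved variant).

    Row i of both inputs is assumed to come from the same sample, so the
    matching audio/expression pair is the positive and all others are negatives.
    """
    a = [normalize(v) for v in audio_shared]
    e = [normalize(v) for v in expr_shared]
    n = len(a)
    logits = [[dot(a[i], e[j]) / temperature for j in range(n)] for i in range(n)]

    def diag_cross_entropy(z):
        # Cross-entropy where the correct class of row i is column i.
        total = 0.0
        for i, row in enumerate(z):
            m = max(row)  # subtract the max for numerical stability
            log_sum = m + math.log(sum(math.exp(x - m) for x in row))
            total += log_sum - row[i]
        return total / len(z)

    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (diag_cross_entropy(logits) + diag_cross_entropy(transposed))


def orthogonality_penalty(shared, specific):
    """Squared Frobenius norm of shared^T @ specific, computed over the batch.

    The value is zero when every shared feature dimension is orthogonal to
    every specific feature dimension.
    """
    total = 0.0
    for s_col in zip(*shared):
        for p_col in zip(*specific):
            total += dot(s_col, p_col) ** 2
    return total


def fuse(*feature_sets):
    """Assumed fusion: concatenate the per-sample vectors of each feature set."""
    return [sum((list(v) for v in rows), []) for rows in zip(*feature_sets)]
```

In a training loop these terms would typically be combined with a classification loss as a weighted sum; the weights, the encoders that produce the shared and specific features, and the classifier applied to the fused vector are all left open here because the abstract does not describe them.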

Keywords: emotion recognition; modal decomposition; deep learning; audio expression

Classification number: TP391.9 [Automation and Computer Technology: Computer Application Technology]

 
