Audio-Visual Bimodal Emotion Recognition Based on Emotional Tone (基于情绪基调的音视频双模态情绪识别算法)  Cited by: 2


Authors: Wei Feigao (卫飞高)[1,2], Zhang Shudong (张树东)[1,2], Fu Xiaohui (付晓慧)[1]

Affiliations: [1] College of Information Engineering, Capital Normal University, Beijing 100048, China; [2] Beijing Advanced Innovation Center for Imaging Technology, Beijing 100048, China

Source: Computer Applications and Software (《计算机应用与软件》), 2018, No. 8, pp. 238-242 (5 pages)

Funding: National Key Research and Development Program of China (2017YFB1400800)

Abstract: In decision-level fusion of audio and video, the fused result is inaccurate when the single-modality emotion recognition results disagree. To address this problem, an audio-visual bimodal emotion recognition algorithm based on emotional tone is proposed. A hidden Markov model with continuous Gaussian mixture densities (GMM-HMM) and a random forest (RF) are used to recognize emotion from audio and video, respectively. The audio and video recognition results are then corrected according to the emotional tone, and decision-level fusion is performed under each emotional tone using linear correlation analysis. Experiments on the SEMAINE database show that the algorithm lowers the root-mean-squared error (RMSE) of emotion recognition and raises the Pearson correlation coefficient (PCC).

Keywords: emotional tone; audio-visual; bimodal; decision-level fusion; emotion recognition

Classification code: TP391.4 [Automation and Computer Technology: Computer Application Technology]
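
Note on the evaluation metrics: the abstract reports results as RMSE and PCC. The sketch below shows the standard definitions of both, assuming predictions and annotations are equal-length sequences of continuous emotion values. The function names `rmse` and `pcc` are illustrative and do not come from the paper, and this is not the authors' evaluation code.

```python
import math


def rmse(predicted, actual):
    """Root-mean-squared error between two equal-length sequences."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def pcc(predicted, actual):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(predicted)
    if n != len(actual) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_p = sum(predicted) / n
    mean_a = sum(actual) / n
    # Unnormalised covariance and variances; the 1/n factors cancel.
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    if var_p == 0 or var_a == 0:
        raise ValueError("PCC is undefined for a constant sequence")
    return cov / math.sqrt(var_p * var_a)
```

A lower RMSE means the predictions lie closer to the annotations in absolute terms, while a PCC closer to 1 means they follow the same trend, so the two measures are complementary.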

 
