Authors: WANG Yang [1]; FU Hongliang; TAO Huawei; YANG Jing [1]; XIE Yue; ZHAO Li [3]
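Affiliations: [1] Key Laboratory of Grain Information Processing and Control, Ministry of Education (Henan University of Technology), Zhengzhou, Henan 450001, China; [2] School of Information and Communication Engineering, Nanjing Institute of Technology, Nanjing, Jiangsu 211167, China; [3] School of Information Science and Engineering, Southeast University, Nanjing, Jiangsu 210096, China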
Source: Journal of Computer Applications (《计算机应用》), 2023, No. 2, pp. 374-379 (6 pages)
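Funding: National Natural Science Foundation of China (62001215); Natural Science Project of the Henan Provincial Department of Education (21A120003, 22A520004, 22A510001); High-Level Talent Start-up Project of Henan University of Technology (2018BS037).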
Abstract: Domain adaptation algorithms are widely used in cross-corpus speech emotion recognition. However, while pursuing a smaller domain discrepancy, many of these algorithms sacrifice the discriminability of target-domain samples, so that those samples cluster at high density near the model's decision boundary and degrade its performance. To address this problem, a cross-corpus speech emotion recognition method based on Decision Boundary Optimized Domain Adaptation (DBODA) is proposed. First, features are processed by a convolutional neural network. The features are then fed into the Maximum Nuclear-norm and Mean Discrepancy (MNMD) module, which reduces the inter-domain discrepancy while maximizing the nuclear norm of the emotion prediction probability matrix of the target domain, thereby improving the discriminability of target-domain samples and optimizing the decision boundary. In six cross-corpus experiments built on the Berlin, eNTERFACE and CASIA speech databases, the average recognition accuracy of the proposed method is 1.68 to 11.01 percentage points higher than that of the other algorithms, indicating that the proposed model effectively reduces the sample density around the decision boundary and improves prediction accuracy.
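The MNMD objective described in the abstract can be sketched roughly as follows. This is an illustrative reading, not the authors' implementation: the abstract does not give the exact loss, so the linear (mean-difference) form of the discrepancy term, the division by batch size, the trade-off weight `lam`, and all function names here are assumptions. The nuclear norm is computed with a plain one-sided Jacobi iteration so that the sketch needs only the standard library.

```python
import math

def nuclear_norm(matrix, sweeps=50, tol=1e-12):
    """Sum of singular values, via one-sided Jacobi (Hestenes) rotations."""
    a = [[float(x) for x in row] for row in matrix]
    rows, cols = len(a), len(a[0])
    for _ in range(sweeps):
        rotated = False
        for p in range(cols - 1):
            for q in range(p + 1, cols):
                alpha = sum(a[i][p] ** 2 for i in range(rows))
                beta = sum(a[i][q] ** 2 for i in range(rows))
                gamma = sum(a[i][p] * a[i][q] for i in range(rows))
                # Columns p and q are already (numerically) orthogonal.
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue
                rotated = True
                zeta = (beta - alpha) / (2.0 * gamma)
                t = math.copysign(1.0, zeta) / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t
                # Rotate the two columns so that they become orthogonal.
                for i in range(rows):
                    x, y = a[i][p], a[i][q]
                    a[i][p] = c * x - s * y
                    a[i][q] = s * x + c * y
        if not rotated:
            break
    # Once all columns are orthogonal, their norms are the singular values.
    return sum(math.sqrt(sum(a[i][j] ** 2 for i in range(rows))) for j in range(cols))

def mean_discrepancy(source_feats, target_feats):
    """Squared distance between domain feature means (assumed linear form)."""
    dim = len(source_feats[0])
    mean_s = [sum(r[d] for r in source_feats) / len(source_feats) for d in range(dim)]
    mean_t = [sum(r[d] for r in target_feats) / len(target_feats) for d in range(dim)]
    return sum((s - t) ** 2 for s, t in zip(mean_s, mean_t))

def mnmd_loss(source_feats, target_feats, target_probs, lam=1.0):
    """Hypothetical combined loss: shrink the domain gap, grow the nuclear norm."""
    batch = len(target_probs)
    return mean_discrepancy(source_feats, target_feats) - lam * nuclear_norm(target_probs) / batch
```

Under these assumptions, a batch of confident and class-diverse target predictions (rows close to distinct one-hot vectors) has a larger nuclear norm than a batch of uniform predictions, so minimizing the loss pushes target samples away from the decision boundary, which is the effect the abstract attributes to the MNMD module.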
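Keywords: cross-corpus speech emotion recognition; convolutional neural network; decision boundary optimization; domain adaptation; feature distribution discrepancy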
Classification code: TP391.4 [Automation and Computer Technology: Computer Application Technology]