Authors: DU Chenyang (杜晨阳); ZHANG Xueying (张雪英)[1]; HUANG Lixia (黄丽霞)[1]; LI Juan (李娟)[1]
Affiliation: [1] College of Electronic Information and Optical Engineering, Taiyuan University of Technology, Taiyuan 030024, Shanxi, China
Source: Computer Engineering (《计算机工程》), 2025, Issue 4, pp. 97-106 (10 pages)
Funding: National Natural Science Foundation of China (62271342).
Abstract: The attention mechanism has been widely used in Speech Emotion Recognition (SER). However, traditional attention modules improve model performance at the cost of a large increase in parameter count. The Efficient Channel Attention (ECA) mechanism has few parameters, but it generates attention weights only along the channel dimension. To address this limitation, an Improved ECA (IECA) module is proposed. With a relatively small number of parameters, the IECA module generates weights for each dimension of the input feature map, so that the model focuses on and exploits the important information in the feature map more effectively. In addition, to further raise the recognition rate, spectrogram features and IS10 features are extracted from the speech separately, and a fusion network performs decision-level fusion of the predictions from the different branches to produce the final prediction. The proposed model achieves a Weighted Accuracy (WA) of 91.63% and 92.46% and an Unweighted Average Recall (UAR) of 91.25% and 92.33% on the EMODB and CASIA speech emotion datasets, respectively, which is 2.69-8.43 and 4.16-10.69 percentage points higher, respectively, than the results reported in previous research.
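The two metrics reported in the abstract can be illustrated with a minimal sketch. It assumes the usual SER definitions, which the record itself does not spell out: WA is the overall accuracy over all utterances, and UAR is the mean of the per-class recalls. The function names and the toy labels below are hypothetical and are not taken from the paper.

```python
from collections import Counter


def weighted_accuracy(y_true, y_pred):
    """WA: fraction of all utterances that are classified correctly."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def unweighted_average_recall(y_true, y_pred):
    """UAR: mean of per-class recalls, so every class counts equally."""
    totals = Counter(y_true)  # number of utterances per true class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Toy, imbalanced labels (hypothetical, not data from the paper).
y_true = ["angry", "angry", "angry", "angry", "sad", "sad"]
y_pred = ["angry", "angry", "angry", "sad", "sad", "angry"]

wa = weighted_accuracy(y_true, y_pred)
uar = unweighted_average_recall(y_true, y_pred)
print(f"WA = {wa:.4f}, UAR = {uar:.4f}")
```

On imbalanced data the two values diverge because WA is dominated by the majority class, which may be why both are reported for EMODB and CASIA.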
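Keywords: deep learning; speech emotion recognition; attention mechanism; multi-feature fusion; decision-level fusion
Classification code: TP183 [Automation and Computer Technology: Control Theory and Control Engineering]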