Authors: Li Qiaojun (李巧君) [1]; Guo Guo (郭彍) [2]
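Affiliations: [1] School of Electronic Information Engineering, Henan Polytechnic Institute, Nanyang 473000, Henan, China; [2] College of Electronic Science and Engineering, University of Electronic Science and Technology of China, Chengdu 610054, Sichuan, China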
Source: Computer Applications and Software (《计算机应用与软件》), 2024, No. 9, pp. 224-229 (6 pages)
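Funding: Key Scientific Research Project of Higher Education Institutions of Henan Province (19A520022); Training Program for Young Backbone Teachers of Higher Vocational Schools of Henan Province (教职成函[2019]326号).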
Abstract: To address the low accuracy and high time complexity of current speech emotion recognition (SER) methods, this paper proposes a deep learning method for SER based on improved K-means clustering. An improved K-means clustering algorithm selects, from the whole audio signal, the key segments that reflect emotional features. The selected sequence is converted into a spectrogram with the short-time Fourier transform (STFT). A deep residual network (ResNet) and a deep bidirectional long short-term memory (Bi-LSTM) network then learn the emotion-related hidden features of the spectrogram in the spatial and temporal dimensions, and a Softmax classifier produces the final emotion classification. Experimental results show that the proposed method has clear advantages over other recognition methods: it improves the emotion recognition rate while reducing the model's processing time.
Keywords: speech emotion recognition; deep bidirectional long short-term memory; K-means clustering; short-time Fourier transform
Classification code: TP393 [Automation and Computer Technology: Computer Application Technology]
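
The first two stages named in the abstract (key-segment selection by clustering, then conversion to a spectrogram) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the record does not describe how their K-means variant is "improved", so the code below uses plain Lloyd's K-means as a stand-in. The segment features (log energy and zero-crossing rate), the rule of keeping the segment nearest each centroid, the synthetic test signal, and all function names and parameter values are assumptions made for illustration. The ResNet and Bi-LSTM classifier is omitted.

```python
import numpy as np

def split_segments(signal, seg_len):
    """Cut a 1-D signal into equal, non-overlapping segments (tail is dropped)."""
    n = len(signal) // seg_len
    return signal[: n * seg_len].reshape(n, seg_len)

def segment_features(segments):
    """Assumed features per segment: log energy and zero-crossing rate."""
    energy = np.log(np.mean(segments ** 2, axis=1) + 1e-10)
    zcr = np.mean(np.abs(np.diff(np.sign(segments), axis=1)) > 0, axis=1)
    return np.stack([energy, zcr], axis=1)

def pairwise_dists(features, centroids):
    """Euclidean distance from every feature row to every centroid."""
    return np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)

def kmeans(features, k, n_iter=50, seed=0):
    """Plain Lloyd's K-means (a stand-in for the paper's improved variant)."""
    rng = np.random.default_rng(seed)
    centroids = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(n_iter):
        labels = pairwise_dists(features, centroids).argmin(axis=1)
        # Recompute each centroid; keep the old one if its cluster is empty.
        new = np.array([
            features[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new, centroids):
            break
        centroids = new
    labels = pairwise_dists(features, centroids).argmin(axis=1)
    return centroids, labels

def select_key_segments(features, centroids):
    """Assumed rule: keep the segment closest to each centroid, in time order."""
    return np.unique(pairwise_dists(features, centroids).argmin(axis=0))

def stft_magnitude(x, n_fft=256, hop=128):
    """Magnitude spectrogram from Hann-windowed frames, shape (freq, time)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop : i * hop + n_fft] * window for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)).T

# Synthetic 2-second signal at 16 kHz: near-silence, then a 220 Hz tone.
sr = 16000
t = np.arange(2 * sr) / sr
noise = 0.01 * np.random.default_rng(0).standard_normal(len(t))
signal = np.sin(2 * np.pi * 220 * t) * (t > 1.0) + noise

segments = split_segments(signal, seg_len=4000)
feats = segment_features(segments)
centroids, labels = kmeans(feats, k=2)
key_idx = select_key_segments(feats, centroids)
key_signal = segments[key_idx].reshape(-1)
spec = stft_magnitude(key_signal)
print(key_idx, spec.shape)
```

In a full pipeline of the kind the abstract describes, `spec` would be the input to the spatial (ResNet) and temporal (Bi-LSTM) feature learners, with a Softmax layer producing the emotion class.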