Authors: 张卫 (ZHANG Wei) [1]; 贾宇 (JIA Yu); 张雪英 (ZHANG Xue-Ying)
Affiliations: [1] College of Information, Shanxi University of Finance and Economics, Taiyuan, Shanxi 030006, China; [2] College of Information and Computer, Taiyuan University of Technology, Taiyuan, Shanxi 030024, China
Source: Computer Simulation (《计算机仿真》), 2022, No. 11, pp. 258-262 (5 pages)
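Funding: National Youth Science Foundation Project (61902226); Shanxi Province Youth Science and Technology Research Fund (201901D211415); Science and Technology Innovation Project of Higher Education Institutions of Shanxi Province (2019L0498); Shanxi University of Finance and Economics Youth Research Fund Project (QN-2019017).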
Abstract: In mixed-language speech emotion recognition, traditional methods do not fully account for the differences between languages, which lowers classification accuracy. To address this, the paper proposes combining an autoencoder with a Long Short-Term Memory (LSTM) model. Three feature types (MFCC, MEL Spectrogram Frequency, and Chroma) are extracted to form a 180-dimensional feature vector. The autoencoder then maps this vector to a higher-dimensional, deeper 500-dimensional feature, which is modeled with an LSTM to improve the accuracy of speech emotion classification. Classification experiments were carried out on the German EMO-DB and Chinese CASIA speech corpora. The results show that the deep features extracted by the autoencoder are better suited to mixed-language speech emotion classification. Compared with traditional classification methods, the autoencoder + LSTM approach improves the best recognition result by 7.5%.
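The abstract's feature step (180 dimensions in, 500 dimensions out) can be illustrated with a minimal NumPy sketch of an overcomplete autoencoder. This is not the authors' implementation. The split of the 180 dimensions into 40 MFCC + 128 mel + 12 chroma values, the tanh activation, the initialization, the learning rate, and all function names are assumptions made for illustration; the abstract gives only the totals. Random vectors stand in for real audio features, and the LSTM classifier that would consume the 500-dimensional codes is left out.

```python
import numpy as np

# Assumed split of the 180 dimensions (the abstract states only the total).
N_MFCC, N_MEL, N_CHROMA = 40, 128, 12
INPUT_DIM = N_MFCC + N_MEL + N_CHROMA  # 180
CODE_DIM = 500                         # overcomplete code size from the abstract


def init_autoencoder(rng, input_dim=INPUT_DIM, code_dim=CODE_DIM):
    """Create encoder/decoder weights with small random values."""
    return {
        "W_enc": rng.normal(0.0, 1.0 / np.sqrt(input_dim), (input_dim, code_dim)),
        "b_enc": np.zeros(code_dim),
        "W_dec": rng.normal(0.0, 1.0 / np.sqrt(code_dim), (code_dim, input_dim)),
        "b_dec": np.zeros(input_dim),
    }


def encode(params, x):
    """Map (batch, 180) features to (batch, 500) codes."""
    return np.tanh(x @ params["W_enc"] + params["b_enc"])


def decode(params, h):
    """Map (batch, 500) codes back to (batch, 180) reconstructions."""
    return h @ params["W_dec"] + params["b_dec"]


def train_step(params, x, lr=0.1):
    """One gradient-descent step on mean squared reconstruction error.

    Returns the loss measured before the update.
    """
    h = encode(params, x)
    err = decode(params, h) - x
    loss = float(np.mean(err ** 2))

    # Backpropagate through the linear decoder and the tanh encoder.
    g_out = 2.0 * err / x.size
    g_pre = (g_out @ params["W_dec"].T) * (1.0 - h ** 2)

    params["W_dec"] -= lr * (h.T @ g_out)
    params["b_dec"] -= lr * g_out.sum(axis=0)
    params["W_enc"] -= lr * (x.T @ g_pre)
    params["b_enc"] -= lr * g_pre.sum(axis=0)
    return loss


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = init_autoencoder(rng)
    x = rng.normal(size=(32, INPUT_DIM))  # placeholder for real audio features
    losses = [train_step(params, x) for _ in range(20)]
    print("code shape:", encode(params, x).shape)
    print("loss before/after:", losses[0], losses[-1])
```

In a full pipeline along the lines the abstract describes, the output of `encode` would replace the raw 180-dimensional vectors as the input to the LSTM classifier.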
Classification code: TP391.9 [Automation and Computer Technology: Computer Application Technology]