Authors: LI Wen-jie[1]; LUO Wen-jun; LI Yi-wen; SU Cheng-yue[2]; CHEN Yu-huai; CAO Yue
Affiliations: [1] School of Information Engineering, Guangdong University of Technology, Guangzhou 510006, China; [2] School of Physics and Optoelectronic Engineering, Guangdong University of Technology, Guangzhou 510006, China
Source: Information Technology (《信息技术》), 2020, No. 10, pp. 61-66 (6 pages)
Funding: Zhongshan Major Science and Technology Special Project (2016A1003).
Abstract: Speech emotion recognition is an active research topic in human-computer interaction. Ordinary convolutional neural networks have too many parameters and do not handle temporal information well. To address both problems, this paper presents a method that applies separable convolution and LSTM to speech emotion recognition, and validates it on the RAVDESS emotional speech corpus. The 1D Sep-CNN-LSTM model trained on MFCC features achieved a recognition accuracy of 90.77%, with the model compressed by about 40%. The 2D Sep-CNN-LSTM model trained on spectrogram features achieved a recognition accuracy of 82.21%, with the model compressed by about 75%. The experiments indicate that the method has some advantages over other models for speech emotion recognition and is suitable for real-time applications on lower-level (slave) computers.
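The parameter saving that motivates separable convolution can be illustrated with a short count. The sketch below compares a standard 1D convolution layer with a depthwise separable one (a per-channel depthwise filter followed by a 1x1 pointwise convolution). The kernel size and channel counts are hypothetical values chosen for illustration, not taken from the paper, and the per-layer reduction it prints is not the whole-model compression of about 40% or 75% reported in the abstract.

```python
def conv1d_params(kernel_size, in_channels, out_channels):
    """Weights plus biases of a standard 1D convolution layer."""
    return kernel_size * in_channels * out_channels + out_channels


def separable_conv1d_params(kernel_size, in_channels, out_channels):
    """Weights plus biases of a depthwise separable 1D convolution.

    Assumes a depth multiplier of 1 and a single bias vector applied
    after the pointwise step.
    """
    depthwise = kernel_size * in_channels   # one filter per input channel
    pointwise = in_channels * out_channels  # 1x1 convolution mixing channels
    return depthwise + pointwise + out_channels


# Hypothetical layer sizes, for illustration only (not from the paper).
k, c_in, c_out = 5, 64, 128

standard = conv1d_params(k, c_in, c_out)
separable = separable_conv1d_params(k, c_in, c_out)
reduction = 1 - separable / standard

print(f"standard:  {standard} parameters")
print(f"separable: {separable} parameters")
print(f"per-layer reduction: {reduction:.1%}")
```

The LSTM layers that follow the convolutions in a Sep-CNN-LSTM model are not affected by this substitution, which is one reason a whole-model compression figure is smaller than the per-layer figure computed here.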
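Keywords: speech emotion recognition; separable convolution; LSTM; MFCC; spectrogram
Classification: TP391 [Automation and Computer Technology: Computer Application Technology]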