Authors: QIAO Dong (乔栋); CHEN Zhangjin (陈章进) [1,2]; DENG Liang (邓良); TU Chengli (屠程力)
Affiliations: [1] Microelectronics Research and Development Center, Shanghai University, Shanghai 200444, China [2] Computing Centre, Shanghai University, Shanghai 200444, China
Source: Computer Engineering (《计算机工程》), 2022, No. 2, pp. 281-290 (10 pages)
Funding: National Natural Science Foundation of China (61674100).
Abstract: Speech emotion recognition is important in human-computer interaction. To address the low efficiency and accuracy of Chinese speech emotion recognition, this study proposes a Chinese speech emotion recognition method based on the Trumpet-6 convolutional neural network model. During Mel Frequency Cepstral Coefficient (MFCC) feature extraction, the number of sampling points used in the framing and windowing operation is increased, which adds features within each Hamming window and reduces the number of Hamming windows. This shrinks the pixel size of the MFCC feature map and improves the processing efficiency of a single recognition. On this basis, Gaussian white noise is used to augment the data set, which mitigates overfitting during training. Experimental results on the CASIA speech emotion data set show that the method reaches a test accuracy of 95.7%, outperforming traditional methods such as LeNet-5, Recurrent Neural Network (RNN), and Long Short-Term Memory (LSTM). The Trumpet-6 convolutional neural network model uses 2048 sampling points and requires only 176,550 trainable parameters. Compared with ResNet34, which uses a deep convolutional neural network (DCNN), and with recurrent neural network models, it has fewer parameters, a simpler structure, and faster processing.
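Keywords: speech emotion recognition; MFCC features; Gaussian white noise; data augmentation; convolutional neural network
Classification: TP391.4 [Automation and Computer Technology: Computer Application Technology]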
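
The two preprocessing ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract gives only the 2048-point frame size; the 16 kHz sample rate, the 50% hop, the 512-point baseline frame, the 20 dB signal-to-noise ratio, and the function names num_frames and add_white_noise are all assumptions made for illustration. The first function shows why a longer frame yields fewer Hamming windows, and therefore a narrower MFCC feature map. The second adds Gaussian white noise to a waveform as a form of data augmentation.

```python
import numpy as np


def num_frames(n_samples: int, frame_length: int, hop_length: int) -> int:
    """Count the full frames that fit in a signal when no padding is applied."""
    if n_samples < frame_length:
        return 0
    return 1 + (n_samples - frame_length) // hop_length


def add_white_noise(signal, snr_db: float, rng: np.random.Generator) -> np.ndarray:
    """Return a copy of `signal` with zero-mean Gaussian white noise added.

    The noise variance is chosen so that the ratio of signal power to
    noise power is approximately `snr_db` decibels.
    """
    signal = np.asarray(signal, dtype=np.float64)
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise


# Assumed values: a 2-second clip sampled at 16 kHz (not stated in the abstract).
sample_rate = 16000
n_samples = 2 * sample_rate

# Frame counts for a hypothetical 512-point baseline and the 2048-point frame,
# both with an assumed hop of half the frame length.
frames_short = num_frames(n_samples, frame_length=512, hop_length=256)
frames_long = num_frames(n_samples, frame_length=2048, hop_length=1024)
print("frames with 512-point window:", frames_short)
print("frames with 2048-point window:", frames_long)

# Augment a synthetic 440 Hz tone standing in for a speech waveform.
rng = np.random.default_rng(0)
t = np.arange(n_samples) / sample_rate
clean = np.sin(2.0 * np.pi * 440.0 * t)
noisy = add_white_noise(clean, snr_db=20.0, rng=rng)

# Measure the signal-to-noise ratio that was actually achieved.
measured_snr_db = 10.0 * np.log10(np.mean(clean ** 2) / np.mean((noisy - clean) ** 2))
print("measured SNR (dB):", round(float(measured_snr_db), 2))
```

Under these assumptions, the time axis of the feature map shrinks by roughly the same factor as the hop length grows, which is the mechanism the abstract credits for the faster single-recognition processing. Each augmented copy keeps the emotion label of the clean clip it was derived from.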