The Efficacy of Deep Learning-Based Mixed Model for Speech Emotion Recognition (Cited by: 1)

Authors: Mohammad Amaz Uddin, Mohammad Salah Uddin Chowdury, Mayeen Uddin Khandaker, Nissren Tamam, Abdelmoneim Sulieman

Affiliations: [1] Department of Computer Science and Engineering, BGC Trust University Bangladesh, Chittagong, 4381, Bangladesh; [2] Centre for Applied Physics and Radiation Technologies, School of Engineering and Technology, Sunway University, Bandar Sunway, Selangor, 47500, Malaysia; [3] Department of Physics, College of Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh, 11671, Saudi Arabia; [4] Department of Radiology and Medical Imaging, Prince Sattam bin Abdulaziz University, Alkharj, Saudi Arabia

Source: Computers, Materials & Continua, 2023, Issue 1, pp. 1709-1722 (14 pages)

Funding: Princess Nourah bint Abdulrahman University Researchers Supporting Project (Grant No. PNURSP2022R12), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.

Abstract: Human speech indirectly reflects the speaker's mental state or emotion. Artificial Intelligence (AI)-based techniques that recognize emotion from speech may prove revolutionary. In this study, we introduce a robust method for emotion recognition from human speech that combines an effective preprocessing technique with a deep learning-based mixed model consisting of Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN) components. About 2800 audio files were taken from the Toronto emotional speech set (TESS) database. A high-pass filter and a Savitzky-Golay filter were used to obtain noise-free, smooth audio data. Seven emotions were used: Angry, Disgust, Fear, Happy, Neutral, Pleasant-surprise, and Sad. Energy, fundamental frequency, and Mel Frequency Cepstral Coefficients (MFCC) were used as emotion features, and these features yielded 97.5% accuracy with the mixed LSTM+CNN model. The mixed model was found to perform better than the usual state-of-the-art models in emotion recognition from speech. This also indicates that the mixed model could be used effectively in advanced research on sound processing.
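The preprocessing chain named in the abstract (high-pass filtering, Savitzky-Golay smoothing, then energy extraction) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no filter order, cutoff, window length, or frame size, so the first-order RC high-pass filter, the fixed 5-point Savitzky-Golay window, and the function names `high_pass`, `savgol5`, and `frame_energy` are all assumptions chosen for illustration. Fundamental frequency, MFCC extraction, and the LSTM+CNN model are not sketched here.

```python
import math

# Standard 5-point quadratic/cubic Savitzky-Golay smoothing weights.
# They are divided by 35 when applied, so the effective weights sum to 1.
SG5_WEIGHTS = (-3, 12, 17, 12, -3)


def high_pass(signal, cutoff_hz, sample_rate):
    """First-order RC high-pass filter (assumed stand-in; the paper's design is not given)."""
    if not signal:
        return []
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = [signal[0]]
    for i in range(1, len(signal)):
        # y[i] = alpha * (y[i-1] + x[i] - x[i-1]) attenuates slow drift and DC offset.
        out.append(alpha * (out[-1] + signal[i] - signal[i - 1]))
    return out


def savgol5(signal):
    """Savitzky-Golay smoothing with a fixed 5-point window; edge samples are left unchanged."""
    out = list(signal)
    for i in range(2, len(signal) - 2):
        window = signal[i - 2:i + 3]
        out[i] = sum(w * x for w, x in zip(SG5_WEIGHTS, window)) / 35.0
    return out


def frame_energy(signal, frame_len, hop):
    """Short-time energy: sum of squared samples in each full frame."""
    return [
        sum(x * x for x in signal[start:start + frame_len])
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]


if __name__ == "__main__":
    # Hypothetical test tone: 200 Hz sine with a DC offset, sampled at 8 kHz.
    rate = 8000
    tone = [0.5 + math.sin(2.0 * math.pi * 200.0 * n / rate) for n in range(rate // 10)]
    cleaned = savgol5(high_pass(tone, cutoff_hz=50.0, sample_rate=rate))
    print(frame_energy(cleaned, frame_len=200, hop=100)[:3])
```

In practice a library routine such as `scipy.signal.butter` for the high-pass stage and `scipy.signal.savgol_filter` for smoothing would likely replace the hand-written loops; the pure-Python version is used here only to keep each step visible.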

Keywords: Emotion recognition; Savitzky Golay; fundamental frequency; MFCC; neural networks

Classification code: TN912.34 [Electronics and Telecommunications: Communication and Information Systems]
