Three-Channel Pathological Speech Recognition Based on LMD-Improved Feature Extraction


Authors: Zhang Nan; Chen Yuanyuan [1]; Chen Xinyu; Hou Yitao (School of Information and Communication Engineering, North University of China, Taiyuan 030024, China)

Affiliation: [1] School of Information and Communication Engineering, North University of China, Taiyuan 030024, China

Source: Electronic Measurement Technology (《电子测量技术》), 2024, No. 12, pp. 140-147 (8 pages)

Funding: Supported by the Shanxi Province Basic Research Program (202203021221103).

Abstract: Patients with dysphonia often articulate unclearly and inaccurately, which leads to low pathological speech recognition rates. To address this, the paper proposes an improved Gammatone filter bank map feature extraction algorithm based on local mean decomposition (LMD) for three-channel pathological speech recognition. First, the algorithm decomposes the speech signal with LMD, applies a short-time Fourier transform to each decomposed component, and performs frequency synthesis; it then extracts filter bank features together with their first- and second-order difference features, forming LMD-GFbank map features that capture effective local features of pathological speech. Second, to address the problem that the network model misses some effective feature information during training, a three-channel pathological speech recognition model is proposed. Finally, the recognition model is trained and tested with the speech feature information. Experimental results show that LMD-GFbank map features reach a recognition rate of 93.36% on the three-channel pathological speech recognition model, outperforming the traditional MFCC, GFCC, and Fbank features, which verifies that the proposed algorithm and recognition model improve pathological speech recognition accuracy.
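The feature pipeline described in the abstract can be sketched roughly as follows. This is only an illustration under several assumptions, not the authors' implementation: the LMD step itself is not shown (the `components` argument is assumed to already hold the decomposed signal components), "frequency synthesis" is interpreted here as summing the components' power spectra, `weights` is a hypothetical filter-bank matrix standing in for the Gammatone filter bank, and the frame length, hop size, and all function names are chosen for illustration.

```python
import numpy as np

def stft_power(x, n_fft=512, hop=160):
    """Power spectrogram (frames x bins) from Hann-windowed frames."""
    x = np.asarray(x, dtype=float)
    if len(x) < n_fft:
        x = np.pad(x, (0, n_fft - len(x)))
    n_frames = 1 + (len(x) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = x[idx] * np.hanning(n_fft)
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

def delta(feat, N=2):
    """Regression-based difference along the time axis (axis 0)."""
    feat = np.asarray(feat, dtype=float)
    T = feat.shape[0]
    # Repeat the first and last frames so every frame has N neighbours.
    padded = np.pad(feat, ((N, N), (0, 0)), mode="edge")
    denom = 2 * sum(n * n for n in range(1, N + 1))
    out = np.zeros_like(feat)
    for n in range(1, N + 1):
        out += n * (padded[N + n : N + n + T] - padded[N - n : N - n + T])
    return out / denom

def feature_map(components, weights, n_fft=512, hop=160):
    """Stack log filter-bank energies with first- and second-order differences.

    components: iterable of 1-D arrays (assumed to come from an LMD step).
    weights: (n_bands, n_fft // 2 + 1) non-negative matrix
             (hypothetical stand-in for a Gammatone filter bank).
    """
    # Assumption: "frequency synthesis" = summing the per-component power spectra.
    power = sum(stft_power(c, n_fft, hop) for c in components)
    fbank = np.log(power @ weights.T + 1e-10)
    d1 = delta(fbank)
    d2 = delta(d1)
    return np.stack([fbank, d1, d2])  # shape: (3, n_frames, n_bands)
```

Stacking the static, first-order, and second-order features along a leading axis yields an image-like map; whether the paper arranges its LMD-GFbank map this way is not stated in the abstract.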

Keywords: dysphonia; local mean decomposition; pathological speech recognition; feature extraction

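Classification: TN912.34 [Electronics and Telecommunications: Communication and Information Systems]; R741 [Electronics and Telecommunications: Information and Communication Engineering]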

 
