Research on End-to-end Voice Recognition Method Based on Deep Convolutional Neural Networks

Cited by: 1


Authors: LI Jin-hui (李瑾辉) [1]; ZHANG Guo-liang (张国梁) [1]; SU Yang (苏杨) [1]; ZHU Xiao-hong (朱晓鸿) [1]; WANG Xin (王鑫) [1]

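Affiliation: [1] Communication Control Center, Information Communication Branch, State Grid Jiangsu Electric Power Co., Ltd., Nanjing 210024, Jiangsu, China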

Source: Techniques of Automation and Applications (《自动化技术与应用》), 2024, No. 6, pp. 55-59 (5 pages)

Funding: 2020 Jiangsu Province Natural Science Key Research Project (KJ2020A1098).

Abstract: End-to-end speech is transmitted over a direct communication channel without encryption, and interference during transmission makes signal feature extraction difficult. To address this, an end-to-end speech recognition method based on a deep convolutional neural network is proposed. First, the speech is denoised using a scale noise energy estimation method. Second, speech feature information is extracted by aggregated empirical mode decomposition. Finally, a residual network is used to optimize the deep convolutional neural network model, which then performs end-to-end speech recognition. Experimental results show that the maximum word error rate of the proposed method is 10% without added noise and 12% with added noise, indicating that the method can perform end-to-end speech recognition efficiently and accurately and has practical application value.
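The abstract reports its results as word error rate (WER) but does not say how the metric was computed. As a rough illustration only, the sketch below uses the conventional definition: the word-level edit distance (substitutions, deletions and insertions) between a reference transcript and the recognizer output, divided by the number of reference words. The function name `word_error_rate`, the whitespace tokenization, and the sample sentences are assumptions made for this example and are not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: both transcripts are already split into words by whitespace.
    Unsegmented Chinese text would need a word segmenter first, or a
    character-level variant of the same calculation.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts: one substituted word out of ten reference words.
reference = "please open the main breaker at substation seven right now"
hypothesis = "please open the main breaker at station seven right now"
print(f"WER = {word_error_rate(reference, hypothesis):.0%}")
```

Under this definition, a 10% WER corresponds to one word-level error for every ten reference words. Because insertions are counted, WER can exceed 100% when the hypothesis is much longer than the reference.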

Keywords: speech recognition; speech denoising; end-to-end; deep convolutional neural network; aggregated empirical mode decomposition

Classification number: TP391.42 [Automation and Computer Technology: Computer Application Technology]

 
