A study of Chinese speech recognition based on bidirectional recurrent neural network  Cited by: 9

Authors: LI Peng[1]; YANG Yuanwei; GAO Xianjun[1]; DU Lihui; ZHOU Yi; JANG Mengyue; ZHANG Jingbo (College of Geosciences, Yangtze University, Wuhan 430100, China)

Affiliation: [1] College of Geosciences, Yangtze University, Wuhan 430100, China

Source: Journal of Applied Acoustics (《应用声学》), 2020, No. 3, pp. 464-471 (8 pages)

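Funding: Scientific Research Program of the Hubei Provincial Department of Education (Q20181317); Yangtze University Undergraduate Innovation and Entrepreneurship Fund (2018012); Development Fund of the Key Laboratory of Geographic National Conditions Monitoring, National Administration of Surveying, Mapping and Geoinformation (2017NGCM07).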

Abstract: In current deep neural network (DNN) models, the hidden part can be made several layers deep and adapts well to complex problems, but the node connections between layers are independent of one another. Because of this structure, a DNN cannot use contextual information in a speech sequence to improve recognition. A conventional recurrent neural network (RNN) improves on this, but it can use only the preceding context. To address these problems, this paper combines a bidirectional RNN (Bi-RNN), which can use both the preceding and the following context in a speech sequence, with a DNN, and applies the combined model to speech recognition. The model has five hidden layers: the third is a Bi-RNN layer and the others use the DNN structure. The experimental results show that: (1) compared with the other models, the model with the Bi-RNN layer achieves higher recognition accuracy; (2) noise has a strong effect on Bi-RNN Chinese speech recognition, and in particular, when the noise added to the training set and to the test set is of different types, a model trained on speech with a single noise type cannot adapt to speech with other noise types; (3) recognition accuracy does not keep rising as the number of hidden-layer neurons increases, and beyond a certain number of neurons it tends to fall.
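The layer arrangement described in the abstract (five hidden layers, with the third one bidirectional) can be sketched as a forward pass in plain Python. This is only an illustrative sketch, not the authors' implementation: the abstract does not give the activation functions, layer sizes, weight initialization, output layer, or how the two directions are merged, so the ReLU and tanh units, the concatenation of forward and backward states, and every name below (`dense`, `rnn_pass`, `bi_rnn`, `build_model`, `forward`) are assumptions made for illustration.

```python
import math
import random

def dense(x, W, b):
    """Feed-forward (DNN) layer on one frame: ReLU(W x + b). ReLU is an assumption."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

def rnn_pass(frames, W_in, W_rec, b):
    """Simple tanh recurrence over the frames, in the order given."""
    h = [0.0] * len(b)
    outputs = []
    for x in frames:
        h = [math.tanh(sum(w * xi for w, xi in zip(in_row, x))
                       + sum(w * hj for w, hj in zip(rec_row, h)) + bi)
             for in_row, rec_row, bi in zip(W_in, W_rec, b)]
        outputs.append(h)
    return outputs

def bi_rnn(frames, fwd, bwd):
    """Bidirectional layer: one pass left-to-right, one right-to-left.
    The two states are concatenated per frame (an assumed merge rule)."""
    forward_states = rnn_pass(frames, *fwd)
    backward_states = rnn_pass(frames[::-1], *bwd)[::-1]
    return [hf + hb for hf, hb in zip(forward_states, backward_states)]

def init(rows, cols, rng):
    """Small random weight matrix (hypothetical initialization)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def build_model(n_in, n_hidden, seed=0):
    """Parameters for five hidden layers; the third one is bidirectional."""
    rng = random.Random(seed)

    def dense_params(n_from):
        return init(n_hidden, n_from, rng), [0.0] * n_hidden

    def rnn_params():
        return (init(n_hidden, n_hidden, rng),
                init(n_hidden, n_hidden, rng),
                [0.0] * n_hidden)

    return {
        "d1": dense_params(n_in),
        "d2": dense_params(n_hidden),
        "fwd": rnn_params(),
        "bwd": rnn_params(),
        "d4": dense_params(2 * n_hidden),  # input is the concatenated Bi-RNN state
        "d5": dense_params(n_hidden),
    }

def forward(model, frames):
    """Run a sequence of feature frames through the five hidden layers."""
    h = [dense(dense(x, *model["d1"]), *model["d2"]) for x in frames]  # layers 1-2
    h = bi_rnn(h, model["fwd"], model["bwd"])                          # layer 3
    return [dense(dense(x, *model["d4"]), *model["d5"]) for x in h]    # layers 4-5

if __name__ == "__main__":
    model = build_model(n_in=4, n_hidden=3)
    frames = [[0.1 * (t + i) for i in range(4)] for t in range(5)]
    print(len(forward(model, frames)), "frames out")
```

The point of the third layer is visible in `bi_rnn`: the state at each frame combines a summary of the earlier frames with a summary of the later ones, whereas the feed-forward layers around it treat each frame on its own.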

Keywords: speech recognition; deep learning; deep neural network; recurrent neural network

Classification code: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]

 
