Long Short-Term Memory Recurrent Neural Network-Based Acoustic Model Using Connectionist Temporal Classification on a Large-Scale Training Corpus    Cited by: 9


Authors: Donghyun Lee, Minkyu Lim, Hosung Park, Yoseb Kang, Jeong-Sik Park, Gil-Jin Jang, Ji-Hwan Kim

Affiliations: [1] Department of Computer Science and Engineering, Sogang University; [2] Department of Information and Communication Engineering, Yeungnam University; [3] School of Electronics Engineering, Kyungpook National University

Source: China Communications (English edition), 2017, Issue 9, pp. 23-31 (9 pages)

Funding: Supported by the Ministry of Trade, Industry & Energy (MOTIE, Korea) under the Industrial Technology Innovation Program (No. 10063424, "Development of distant speech recognition and multi-task dialog processing technologies for in-door conversational robots")

Abstract: A Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN) has driven tremendous improvements over acoustic models based on the Gaussian Mixture Model (GMM). However, these hybrid models require a force-aligned Hidden Markov Model (HMM) state sequence obtained from a GMM-based acoustic model, so both the GMM-based acoustic model and the deep learning-based acoustic model must be trained, which takes a long computation time. To solve this problem, an acoustic model using the Connectionist Temporal Classification (CTC) algorithm is proposed. The CTC algorithm does not require a GMM-based acoustic model because it does not use a force-aligned HMM state sequence. However, previous work on LSTM RNN-based acoustic models using CTC relied on small-scale training corpora. In this paper, an LSTM RNN-based acoustic model using CTC is trained on a large-scale training corpus and its performance is evaluated. The implemented acoustic model achieves Word Error Rates (WER) of 6.18% and 15.01% for clean speech and noisy speech, respectively, which is similar to the performance of an acoustic model based on the hybrid method.
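
Note: The paper does not include an implementation listing. The following is a minimal illustrative sketch (not the authors' code) of the core idea described in the abstract: a bidirectional LSTM acoustic model trained with the CTC loss, which needs only unaligned label sequences rather than a force-aligned HMM state sequence. It assumes PyTorch's nn.CTCLoss; the feature dimension, hidden size, layer count, and label-set size are placeholder values, not the paper's configuration.

```python
import torch
import torch.nn as nn

class LSTMAcousticModel(nn.Module):
    """Bidirectional LSTM over acoustic feature frames with a CTC output layer."""
    def __init__(self, num_features=40, hidden_size=320, num_layers=4, num_labels=46):
        super().__init__()
        # Stacked bidirectional LSTM over the input feature sequence.
        self.lstm = nn.LSTM(num_features, hidden_size, num_layers,
                            batch_first=True, bidirectional=True)
        # Output layer spans the label set plus the CTC blank symbol (index 0).
        self.fc = nn.Linear(2 * hidden_size, num_labels + 1)

    def forward(self, x):
        # x: (batch, time, num_features)
        out, _ = self.lstm(x)
        return self.fc(out)  # (batch, time, num_labels + 1)

model = LSTMAcousticModel()
ctc_loss = nn.CTCLoss(blank=0)

# Toy batch: 2 utterances, 100 frames of 40-dim features, 20 target labels each.
# Only the label sequences are given -- no frame-level forced alignment.
features = torch.randn(2, 100, 40)
targets = torch.randint(1, 47, (2, 20))
input_lengths = torch.full((2,), 100, dtype=torch.long)
target_lengths = torch.full((2,), 20, dtype=torch.long)

# CTCLoss expects log-probabilities shaped (time, batch, classes).
log_probs = model(features).log_softmax(dim=-1).transpose(0, 1)
loss = ctc_loss(log_probs, targets, input_lengths, target_lengths)
loss.backward()
```

At decoding time, the blank symbol and consecutive repeated labels are collapsed to recover the output label sequence, which is what removes the need for an explicit frame-to-state alignment.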

Keywords: acoustic model; connectionist temporal classification; large-scale training corpus; long short-term memory recurrent neural network

Classification codes: TN912.3 [Electronics & Telecommunications -- Communication and Information Systems]; TP183 [Electronics & Telecommunications -- Information and Communication Engineering]

 
