Speech Recognition Model of Civil Aviation Radiotelephony Communication Based on BiLSTM  Cited by: 8


Authors: QIU Yi; JIA Gui-min[1]; YANG Jin-feng[1]; LIU Yuan-qing (Tianjin Key Lab for Advanced Signal Processing, Civil Aviation University of China, Tianjin 300300, China)

Affiliation: [1] Tianjin Key Lab for Advanced Signal Processing, Civil Aviation University of China, Tianjin 300300

Source: Journal of Signal Processing (《信号处理》), 2019, No. 2, pp. 293-300 (8 pages)

Funding: National Natural Science Foundation of China (U1433120; 61502498); Fundamental Research Funds for the Central Universities (ZYGX2018042)

Abstract: Radiotelephony communication is crucial to flight safety in civil aviation. Because it follows a special grammatical structure and pronunciation, acoustic models built for everyday speech recognition cannot be applied effectively to it. To address this specific context, this paper proposes a speech recognition method for civil aviation radiotelephony communication based on bidirectional long short-term memory (BiLSTM) networks. First, FBANK features extracted from radiotelephony speech are used as input, and a multi-layer BiLSTM network is trained with connectionist temporal classification (CTC) as the objective function, yielding a BiLSTM/CTC acoustic model. Then the acoustic model, a language model, and a radiotelephony lexicon are combined to recognize radiotelephony speech, and the model is further trained with a combination of data augmentation and data migration to improve recognition performance. Experimental results show that the proposed method is suitable for speech recognition of civil aviation radiotelephony communication and that the data-augmented model effectively reduces the word error rate.
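
Two notions in the abstract can be illustrated with a minimal sketch: how a CTC-trained network's per-frame outputs collapse into a label sequence, and how the word error rate is computed. This is not the authors' implementation. The function names `ctc_greedy_collapse` and `word_error_rate`, the choice of index 0 for the blank symbol, and the sample phrase are all assumptions made for illustration. The greedy collapse shown here is also a simplification: the paper decodes with a language model and a lexicon.

```python
from typing import List, Sequence


def ctc_greedy_collapse(frame_labels: Sequence[int], blank: int = 0) -> List[int]:
    """Collapse a per-frame label path into an output sequence (CTC rule):
    merge consecutive repeats, then drop blank symbols.
    The blank index 0 is an assumption of this sketch."""
    output: List[int] = []
    previous = None
    for label in frame_labels:
        # Emit a label only when it differs from the previous frame and is not blank.
        if label != previous and label != blank:
            output.append(label)
        previous = label
    return output


def word_error_rate(reference: Sequence[str], hypothesis: Sequence[str]) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words,
    computed as a word-level Levenshtein distance."""
    if not reference:
        raise ValueError("reference must contain at least one word")
    # prev_row[j] holds the edit distance between the reference prefix
    # processed so far and hypothesis[:j].
    prev_row = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        row = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            row.append(min(
                prev_row[j] + 1,         # deletion of a reference word
                row[j - 1] + 1,          # insertion of a hypothesis word
                prev_row[j - 1] + cost,  # match or substitution
            ))
        prev_row = row
    return prev_row[-1] / len(reference)


if __name__ == "__main__":
    # Frames "blank, 1, 1, blank, 1, 2, 2, blank" collapse to [1, 1, 2]:
    # the blank between the two 1s keeps them as separate labels.
    print(ctc_greedy_collapse([0, 1, 1, 0, 1, 2, 2, 0]))

    # Made-up phrase for illustration only; one missing word out of seven.
    ref = "climb to flight level three one zero".split()
    hyp = "climb flight level three one zero".split()
    print(word_error_rate(ref, hyp))
```

In the example, the hypothesis drops one of seven reference words, so the word error rate is 1/7. Any reduction in word error rate reported for the data-augmented model refers to this kind of ratio, measured on the authors' own test data.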

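Keywords: civil aviation radiotelephony communication; speech recognition; bidirectional long short-term memory network; data augmentation; data migration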

Classification code: TP391.3 [Automation and Computer Technology: Computer Application Technology]

 
