End-to-end speech recognition based on gated convolutional neural network and CTC  Cited by: 15

Authors: YANG De-ju (杨德举); MA Liang-li (马良荔) [1]; TAN Lin-shan (谭琳珊); PEI Jing-jing (裴晶晶) (College of Electronic Engineering, Naval University of Engineering, Wuhan 430033, China; 91001 PLA Troops, Beijing 100841, China)

Affiliations: [1] College of Electronic Engineering, Naval University of Engineering, Wuhan, Hubei 430033, China [2] 91001 PLA Troops, Beijing 100841, China

Source: Computer Engineering and Design (《计算机工程与设计》), 2020, No. 9, pp. 2650-2654 (5 pages)

Abstract: Traditional acoustic models rely on complex components that cannot be trained jointly, and they require the training data to be pre-aligned. To address these problems, an end-to-end Chinese speech recognition model based on a one-dimensional gated convolutional neural network and CTC is proposed. Acoustic modeling is carried out by stacking multiple one-dimensional convolutional layers to extract high-level abstract features that contain context information; gated linear units are incorporated to mitigate vanishing gradients; and the CTC algorithm enables end-to-end training and decoding with Chinese characters as the modeling units. Experimental results on a public data set show that the model clearly outperforms the baseline model, reducing the character error rate by more than 3.3%.
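The two building blocks named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the record gives no kernel sizes, layer counts, or decoding strategy, so the function names (`conv1d`, `gated_conv1d`, `ctc_greedy_decode`), the weight layout `kernel[tap][out_channel][in_channel]`, and the choice of greedy (best-path) decoding are all assumptions made for illustration. The gated layer follows the usual gated linear unit form, a linear convolution multiplied element-wise by the sigmoid of a second convolution, and the decoder applies the standard CTC collapse rule (merge repeated labels, then drop blanks).

```python
import math


def sigmoid(x):
    """Logistic function used as the gate activation."""
    return 1.0 / (1.0 + math.exp(-x))


def conv1d(frames, kernel, bias):
    """Unpadded 1-D convolution over the time axis.

    frames: list of T feature vectors, each of length C_in
    kernel: kernel[k][o][i] = weight for tap k, output channel o, input channel i
    bias:   list of C_out biases
    Returns T - K + 1 output vectors of length C_out.
    """
    width = len(kernel)
    out = []
    for t in range(len(frames) - width + 1):
        row = []
        for o, b in enumerate(bias):
            acc = b
            for k in range(width):
                for i, x in enumerate(frames[t + k]):
                    acc += kernel[k][o][i] * x
            row.append(acc)
        out.append(row)
    return out


def gated_conv1d(frames, w_lin, b_lin, w_gate, b_gate):
    """Gated linear unit: (X * W + b) multiplied element-wise by sigmoid(X * V + c)."""
    linear = conv1d(frames, w_lin, b_lin)
    gate = conv1d(frames, w_gate, b_gate)
    return [
        [a * sigmoid(g) for a, g in zip(lin_row, gate_row)]
        for lin_row, gate_row in zip(linear, gate)
    ]


def ctc_greedy_decode(posteriors, blank=0):
    """Best-path CTC decoding (an assumed strategy, not stated in the record).

    posteriors: list of per-frame score vectors over labels
    Picks the top label per frame, merges consecutive repeats, drops blanks.
    """
    best = [max(range(len(p)), key=p.__getitem__) for p in posteriors]
    decoded, prev = [], blank
    for label in best:
        if label != prev and label != blank:
            decoded.append(label)
        prev = label
    return decoded
```

In a character-level setup such as the one described, each non-blank label index would map to one Chinese character, so the list returned by `ctc_greedy_decode` could be turned into text through a vocabulary lookup (hypothetical here).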

Keywords: speech recognition; end-to-end; convolutional neural network; gated linear unit; connectionist temporal classification

Classification number: TP391.42 [Automation and Computer Technology: Computer Application Technology]

 
