Author Affiliations: [1] Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190, China; [2] School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 101408, China; [3] Xinjiang Laboratory of Minority Speech and Language Information Processing, Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
Source: IEEE/CAA Journal of Automatica Sinica, 2019, Issue 5, pp. 1187-1195 (9 pages)
Funding: Partially supported by the National Natural Science Foundation of China (11590770-4, U1536117); the National Key Research and Development Program of China (2016YFB0801203, 2016YFB0801200); the Key Science and Technology Project of the Xinjiang Uygur Autonomous Region (2016A03007-1); and the Pre-research Project for Equipment of General Information System (JZX2017-0994/Y306)
Abstract: It is well known that automatic speech recognition (ASR) is a resource-consuming task. It takes a sufficient amount of data to train a state-of-the-art deep neural network acoustic model. For some low-resource languages where scripted speech is difficult to obtain, data sparsity is the main problem that limits the performance of speech recognition systems. In this paper, several knowledge transfer methods are investigated to overcome the data sparsity problem with the help of high-resource languages. The first is a pre-training and fine-tuning (PT/FT) method, in which the parameters of the hidden layers are initialized with a well-trained neural network. Second, progressive neural networks (Prognets) are investigated. With the help of lateral connections in the network architecture, Prognets are immune to the forgetting effect and superior in knowledge transfer. Finally, bottleneck features (BNF) are extracted using cross-lingual deep neural networks and serve as enhanced features to improve the performance of the ASR system. Experiments are conducted on a low-resource Vietnamese dataset. The results show that all three methods yield significant gains over the baseline system, and the Prognets acoustic model performs best. Further improvements can be obtained by combining the Prognets model and bottleneck features.
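Of the three methods the abstract describes, the progressive network is the most architecture-specific. The sketch below, assuming PyTorch, illustrates the lateral-connection idea only: a frozen source-language column feeds adapter connections into a trainable target-language column, so source knowledge is reused without being overwritten. The two-layer depth, the dimensions (440-dimensional input features, 1024 hidden units, 3000 senone targets), and all class names are illustrative assumptions, not the paper's actual configuration.

```python
import torch
import torch.nn as nn

class SourceColumn(nn.Module):
    """Hypothetical source-language column; assumed already trained."""
    def __init__(self, in_dim=440, hidden=1024):
        super().__init__()
        self.h1 = nn.Linear(in_dim, hidden)
        self.h2 = nn.Linear(hidden, hidden)

class TargetColumn(nn.Module):
    """Target-language column with a lateral adapter from the source column."""
    def __init__(self, source, in_dim=440, hidden=1024, out_dim=3000):
        super().__init__()
        self.source = source
        for p in self.source.parameters():
            p.requires_grad = False          # freeze source weights: no forgetting
        self.h1 = nn.Linear(in_dim, hidden)
        self.h2 = nn.Linear(hidden, hidden)
        self.u2 = nn.Linear(hidden, hidden, bias=False)  # lateral adapter
        self.out = nn.Linear(hidden, out_dim)

    def forward(self, x):
        s1 = torch.relu(self.source.h1(x))   # frozen source activations
        t1 = torch.relu(self.h1(x))
        # the target layer combines its own input with a lateral
        # connection from the corresponding source layer below it
        t2 = torch.relu(self.h2(t1) + self.u2(s1))
        return self.out(t2)                  # senone posteriors for the ASR decoder

x = torch.randn(8, 440)                      # a batch of acoustic feature vectors
model = TargetColumn(SourceColumn())
print(model(x).shape)                        # torch.Size([8, 3000])
```

Because the source column is frozen, only the target layers and the adapter are updated during target-language training, which is what makes the architecture immune to the forgetting effect mentioned above.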
Keywords: bottleneck feature (BNF); cross-lingual automatic speech recognition (ASR); progressive neural networks (Prognets) model; transfer learning