End-to-End Uyghur Speech Recognition Based on Multi-Task Learning

Cited by: 1

Authors: SUBI Aiyiti; NURMEMET Yolwas [1,2]; HUANG Hao; WUSHOUR Silamu [1,2]

Affiliations: [1] School of Information Science and Engineering, Xinjiang University, Urumqi, Xinjiang 830046, China; [2] Multi-language Information Technology Laboratory of Xinjiang, Urumqi, Xinjiang 830046, China

Source: Journal of Signal Processing (《信号处理》), 2021, No. 10, pp. 1852-1859 (8 pages)

Funding: National Natural Science Foundation of China (62066043).

Abstract: Uyghur is an agglutinative language with a large vocabulary, which makes it prone to out-of-vocabulary (OOV) words. It is also a low-resource language, so end-to-end speech recognition models for Uyghur perform poorly. To address these problems, this paper proposes an end-to-end Uyghur speech recognition model based on multi-task learning. The encoder uses Conformer and is connected to connectionist temporal classification (CTC). BPE-dropout is used to form more robust sub-words, and both sub-words and characters serve as modeling units for simultaneous multi-task training and decoding. Analysis of the experimental results shows that using sub-words as modeling units effectively addresses the OOV problem, and that the multi-task learning model makes fuller use of the data in a low-resource setting and learns rich temporal speech features, further improving recognition performance. On the public Uyghur speech dataset THUYG-20, the sub-word error rate and character error rate are reduced by 7.3% and 3.8%, respectively, compared with the baseline.
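To make the two modeling units concrete, the following is a minimal sketch of BPE-dropout segmentation producing a sub-word target alongside a character target for the same word. It is not the authors' implementation: the abstract gives neither the merge table nor the dropout rate, so the toy merge table `TOY_MERGES`, the example word, the function names, and the default `dropout` value are all assumptions made for illustration.

```python
import random

# Hypothetical merge table (pair -> priority rank, lower = applied first).
# A real table would be learned from the training transcripts.
TOY_MERGES = {
    ("k", "i"): 0,
    ("t", "a"): 1,
    ("ta", "b"): 2,
    ("ki", "tab"): 3,
    ("l", "a"): 4,
    ("la", "r"): 5,
}


def bpe_dropout_segment(word, merge_ranks, dropout=0.1, rng=None):
    """Segment `word` with BPE, skipping each candidate merge with probability `dropout`."""
    rng = rng or random.Random()
    symbols = list(word)
    while len(symbols) > 1:
        # Collect adjacent pairs found in the merge table; each one is
        # independently dropped with probability `dropout`.
        candidates = []
        for i in range(len(symbols) - 1):
            pair = (symbols[i], symbols[i + 1])
            if pair in merge_ranks and rng.random() >= dropout:
                candidates.append((merge_ranks[pair], i))
        if not candidates:
            break  # nothing survived dropout (or nothing left to merge)
        # Apply the surviving merge with the highest priority (lowest rank).
        _, i = min(candidates)
        symbols[i:i + 2] = [symbols[i] + symbols[i + 1]]
    return symbols


def multitask_targets(word, merge_ranks, dropout=0.1, rng=None):
    """Return the sub-word and character label sequences for one word."""
    return {
        "subword": bpe_dropout_segment(word, merge_ranks, dropout, rng),
        "char": list(word),
    }


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        # Segmentations may differ between calls because merges are dropped at random.
        print(multitask_targets("kitablar", TOY_MERGES, dropout=0.3, rng=rng))
```

With `dropout=0.0` the function reduces to ordinary deterministic BPE, and with `dropout=1.0` it falls back to characters. Intermediate values expose the model to several segmentations of the same word, which is the source of the robustness the abstract attributes to BPE-dropout.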

Keywords: Conformer; connectionist temporal classification; multi-task learning; sub-word; Uyghur

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
