LKMT: Linguistics Knowledge-Driven Multi-Task Neural Machine Translation for Urdu and English

Authors: Muhammad Naeem Ul Hassan, Zhengtao Yu, Jian Wang, Ying Li, Shengxiang Gao, Shuwan Yang, Cunli Mao

Affiliations: [1] Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, 650500, China; [2] Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming, 650500, China

Source: Computers, Materials & Continua (English-language journal), 2024, Issue 10, pp. 951-969 (19 pages)

Funding: Supported by the National Natural Science Foundation of China under Grants 61732005 and 61972186, and by the Yunnan Provincial Major Science and Technology Special Plan Projects (Nos. 202103AA080015, 202203AA080004).

Abstract: Thanks to the strong representation capability of pre-trained language models, supervised machine translation models have achieved outstanding performance. However, the performance of these models drops sharply when the parallel training corpus is limited in scale. Because pre-trained language models are strong at monolingual representation, the key challenge for machine translation is to build an in-depth relationship between the source and target languages by injecting lexical and syntactic information into them. To reduce the dependence on parallel corpora, we propose a Linguistics Knowledge-Driven Multi-Task (LKMT) approach that injects part-of-speech and syntactic knowledge into pre-trained models, thus enhancing machine translation performance. On the one hand, we integrate part-of-speech and dependency labels into the embedding layer and exploit a large-scale monolingual corpus to update all parameters of the pre-trained language models, so that the updated language model contains latent lexical and syntactic information. On the other hand, we leverage an extra self-attention layer to explicitly inject linguistic knowledge into the pre-trained language model-enhanced machine translation model. Experiments on the benchmark dataset show that our proposed LKMT approach improves Urdu-English translation accuracy by 1.97 points and English-Urdu translation accuracy by 2.42 points, highlighting the effectiveness of the LKMT framework. Detailed ablation experiments confirm the positive impact of part-of-speech and dependency parsing on machine translation.
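The embedding-layer step described in the abstract can be illustrated with a small sketch. The abstract does not say how the part-of-speech and dependency labels are combined with the token embeddings, so the element-wise sum used here is an assumption, as are the function names (`make_table`, `embed`), the table sizes, and the toy ids. It is a plain-Python illustration of the idea, not the authors' implementation.

```python
import random

def make_table(num_rows, dim, seed):
    """Build a small random embedding table: num_rows vectors of length dim."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(num_rows)]

def embed(token_ids, pos_ids, dep_ids, tok_table, pos_table, dep_table):
    """Return one vector per position: token + POS + dependency-label embedding.

    Summation is an assumed combination rule; concatenation would be another option.
    """
    if not (len(token_ids) == len(pos_ids) == len(dep_ids)):
        raise ValueError("token, POS, and dependency sequences must align")
    out = []
    for t, p, d in zip(token_ids, pos_ids, dep_ids):
        out.append([a + b + c
                    for a, b, c in zip(tok_table[t], pos_table[p], dep_table[d])])
    return out

DIM = 4                                    # hypothetical embedding width
tok_table = make_table(10, DIM, seed=0)    # hypothetical vocabulary of 10 tokens
pos_table = make_table(3, DIM, seed=1)     # hypothetical set of 3 POS tags
dep_table = make_table(3, DIM, seed=2)     # hypothetical set of 3 dependency labels

# A toy three-token sentence with made-up ids for tokens, POS tags, and labels.
vectors = embed([5, 2, 7], [0, 1, 0], [1, 0, 2], tok_table, pos_table, dep_table)
print(len(vectors), len(vectors[0]))
```

In a real model the three tables would be trainable parameters updated on the monolingual corpus, and the resulting vectors would feed the pre-trained encoder in place of the plain token embeddings.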

Keywords: Urdu NMT (neural machine translation); Urdu natural language processing; Urdu linguistic features; low-resource language; linguistic features; pre-trained model

Classification: TP391.2 [Automation and Computer Technology: Computer Application Technology]

 
