Authors: 谷雪鹏 (GU Xuepeng), 郭军军 (GUO Junjun), 余正涛 (YU Zhengtao) [1,2]
Affiliations: [1] Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, Yunnan, China; [2] Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming 650500, Yunnan, China
Source: Journal of Xiamen University (Natural Science), 2024, Issue 6, pp. 1024-1032 (9 pages)
Funding: National Key R&D Program of China (2020AAA0107904); National Natural Science Foundation of China (62366025); Natural Science Foundation of Yunnan Province (2019FB082, 2019QY1801); Major Science and Technology Projects of Yunnan Province (202002AD080001, 202103AA080015, 202202AD080003).
Abstract: [Objective] To address the problem that fine-tuning alone cannot make full use of pre-trained language knowledge in neural machine translation. [Methods] A neural machine translation method with two-stage interactive fusion of a pre-trained model is proposed. First, the multi-layer representations of the BERT pre-trained model are extracted and used to construct a mask knowledge matrix, through which the pre-trained knowledge contained in BERT is applied to the word embedding layer of the neural machine translation encoder. Second, an adaptive fusion module extracts beneficial knowledge from the BERT multi-layer representations and interactively fuses it with the neural machine translation model. [Results] Experiments show that, compared with the Transformer baseline, the proposed method improves the BLEU score by 1.41 to 4.20 on multiple neural machine translation tasks, and it also delivers a clear performance gain over other neural machine translation methods that integrate pre-trained knowledge. [Conclusion] The proposed two-stage interactive fusion method alleviates catastrophic forgetting, narrows the gap between the pre-trained model and the neural machine translation model caused by their different training objectives, and can effectively exploit pre-trained language knowledge to improve neural machine translation performance.
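The abstract describes the two stages only at a high level, so the following Python (PyTorch) sketch is purely illustrative: the module names, tensor shapes, gating formula, and the similarity-threshold construction of the "mask knowledge matrix" are all assumptions, not the paper's actual formulation.

# Minimal sketch of the two-stage idea from the abstract.
# Stage 1: derive a mask knowledge matrix from BERT's multi-layer
# representations (here: thresholded token-token similarity).
# Stage 2: adaptively fuse pooled BERT knowledge into NMT encoder states.
import torch
import torch.nn as nn

class AdaptiveFusion(nn.Module):
    """Hypothetical adaptive fusion module: learn softmax weights over
    BERT's hidden layers, then gate the pooled representation into the
    NMT encoder states."""
    def __init__(self, d_model: int, num_bert_layers: int):
        super().__init__()
        self.layer_weights = nn.Parameter(torch.zeros(num_bert_layers))
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, nmt_states: torch.Tensor, bert_layers: torch.Tensor):
        # bert_layers: (num_layers, batch, seq, d_model)
        w = torch.softmax(self.layer_weights, dim=0)
        pooled = (w.view(-1, 1, 1, 1) * bert_layers).sum(dim=0)
        g = torch.sigmoid(self.gate(torch.cat([nmt_states, pooled], dim=-1)))
        # Gated residual mix of NMT states and pooled BERT knowledge.
        return g * nmt_states + (1.0 - g) * pooled

def mask_knowledge_matrix(bert_layers: torch.Tensor, scale: float = 0.5):
    """Illustrative stage-1 step: average BERT layers, compute a softmax
    token-token similarity matrix, and binarize it as a mask that could
    modulate the encoder's word embedding layer."""
    avg = bert_layers.mean(dim=0)                       # (batch, seq, d)
    sim = torch.softmax(avg @ avg.transpose(-1, -2), dim=-1)
    return (sim > scale / avg.size(1)).float()          # (batch, seq, seq)

if __name__ == "__main__":
    L, B, S, D = 12, 2, 7, 16
    bert_layers = torch.randn(L, B, S, D)   # stand-in for BERT hidden states
    nmt_states = torch.randn(B, S, D)       # stand-in for NMT encoder states
    fused = AdaptiveFusion(d_model=D, num_bert_layers=L)(nmt_states, bert_layers)
    mask = mask_knowledge_matrix(bert_layers)
    print(fused.shape, mask.shape)          # (2, 7, 16) and (2, 7, 7)

The learned per-layer softmax weights let the model pick which BERT layers carry useful knowledge for translation, and the sigmoid gate bounds how much pre-trained information enters each position, which is one plausible way to limit catastrophic forgetting as the abstract describes.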
Keywords: machine translation; pre-trained language model; attention mechanism; Transformer network model
Classification: TP391 (Automation and Computer Technology / Computer Application Technology)