Authors: 谷雪鹏 (GU Xuepeng), 郭军军 (GUO Junjun), 余正涛 (YU Zhengtao) [1,2]
Affiliations: [1] Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, Yunnan, China; [2] Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming 650500, Yunnan, China
Source: Journal of Xiamen University (Natural Science), 2024, Issue 6, pp. 1024-1032 (9 pages)
Funding: National Key R&D Program of China (2020AAA0107904); National Natural Science Foundation of China (62366025); Natural Science Foundation of Yunnan Province (2019FB082, 2019QY1801); Major Science and Technology Projects of Yunnan Province (202002AD080001, 202103AA080015, 202202AD080003).
Abstract: [Objective] To address the problem that fine-tuning alone cannot make full use of pre-trained language knowledge in neural machine translation. [Methods] A neural machine translation method with two-stage interactive fusion of a pre-trained model is proposed. First, the multi-layer representations of the BERT pre-trained model are extracted and used to construct a mask knowledge matrix, through which the pre-trained knowledge contained in BERT is applied to the word embedding layer of the neural machine translation encoder. Second, an adaptive fusion module extracts beneficial knowledge from the BERT multi-layer representations and interactively fuses it with the neural machine translation model. [Results] Experiments show that, compared with the Transformer baseline, the proposed method improves the BLEU score by 1.41 to 4.20 on multiple neural machine translation tasks, and it also delivers a clear performance gain over other neural machine translation methods that integrate pre-trained knowledge. [Conclusion] The proposed two-stage interactive fusion method alleviates catastrophic forgetting, narrows the gap between the pre-trained model and the neural machine translation model caused by their different training objectives, and can effectively exploit pre-trained language knowledge to improve neural machine translation performance.
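The abstract describes the two stages only at a high level, so the following Python (PyTorch) sketch is purely illustrative: the module names, tensor shapes, gating formula, and the similarity-threshold construction of the "mask knowledge matrix" are all assumptions, not the paper's actual formulation.

# Minimal sketch of the two-stage idea from the abstract.
# Stage 1: derive a mask knowledge matrix from BERT's multi-layer
# representations (here: thresholded token-token similarity).
# Stage 2: adaptively fuse pooled BERT knowledge into NMT encoder states.
import torch
import torch.nn as nn

class AdaptiveFusion(nn.Module):
    """Hypothetical adaptive fusion module: learn softmax weights over
    BERT's hidden layers, then gate the pooled representation into the
    NMT encoder states."""
    def __init__(self, d_model: int, num_bert_layers: int):
        super().__init__()
        self.layer_weights = nn.Parameter(torch.zeros(num_bert_layers))
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, nmt_states: torch.Tensor, bert_layers: torch.Tensor):
        # bert_layers: (num_layers, batch, seq, d_model)
        w = torch.softmax(self.layer_weights, dim=0)
        pooled = (w.view(-1, 1, 1, 1) * bert_layers).sum(dim=0)
        g = torch.sigmoid(self.gate(torch.cat([nmt_states, pooled], dim=-1)))
        # Gated residual mix of NMT states and pooled BERT knowledge.
        return g * nmt_states + (1.0 - g) * pooled

def mask_knowledge_matrix(bert_layers: torch.Tensor, scale: float = 0.5):
    """Illustrative stage-1 step: average BERT layers, compute a softmax
    token-token similarity matrix, and binarize it as a mask that could
    modulate the encoder's word embedding layer."""
    avg = bert_layers.mean(dim=0)                       # (batch, seq, d)
    sim = torch.softmax(avg @ avg.transpose(-1, -2), dim=-1)
    return (sim > scale / avg.size(1)).float()          # (batch, seq, seq)

if __name__ == "__main__":
    L, B, S, D = 12, 2, 7, 16
    bert_layers = torch.randn(L, B, S, D)   # stand-in for BERT hidden states
    nmt_states = torch.randn(B, S, D)       # stand-in for NMT encoder states
    fused = AdaptiveFusion(d_model=D, num_bert_layers=L)(nmt_states, bert_layers)
    mask = mask_knowledge_matrix(bert_layers)
    print(fused.shape, mask.shape)          # (2, 7, 16) and (2, 7, 7)

The learned per-layer softmax weights let the model pick which BERT layers carry useful knowledge for translation, and the sigmoid gate bounds how much pre-trained information enters each position, which is one plausible way to limit catastrophic forgetting as the abstract describes.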
Keywords: machine translation; pre-trained language model; attention mechanism; Transformer network model
Classification: TP391 (Automation and Computer Technology / Computer Application Technology)