Document-level neural machine translation based on topic information

Authors: CHEN Xi-wen; YU Zheng-tao [1,2]; GAO Sheng-xiang [1,2]; WANG Zhen-han

Affiliations: [1] Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, Yunnan, China; [2] Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming 650500, Yunnan, China

Source: Journal of Yunnan University (Natural Sciences Edition), 2023, No. 6, pp. 1197-1207 (11 pages)

Funding: National Natural Science Foundation of China (61972186); Major Science and Technology Special Project of Yunnan Province (202103AA080015, 202203AA080004).

Abstract: Current neural machine translation (NMT) methods take individual sentences as input and cannot make effective use of document-level context during translation, which limits translation quality. To address this missing context, this study proposes a document-level NMT method that incorporates topic information. First, the current source sentence and the preceding source sentence are fed independently into a source sentence encoder and a context encoder. Next, an attention mechanism maps the outputs of the two encoders into a final context representation, which is combined with the source sentence encoder output through a gating mechanism to produce a fused representation carrying both context and current-sentence information. In parallel, the word-embedded source sentence is mapped to a topic representation by a topic encoder based on a Bi-GRU and a convolutional neural network. Finally, the fused sentence representation and the topic representation each take part in decoding through two attention mechanisms connected in series. Experiments show that the method improves document-level NMT performance, raising BLEU by up to 0.55 percentage points over the baseline system.
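The context-fusion step described in the abstract (attention over the context encoder output, followed by a gate that mixes the result with the source encoder output) can be illustrated with a minimal sketch. The abstract gives no equations, so everything here is an assumption made for illustration: the function names (`attend`, `gated_fusion`, `fuse_sentence`), the use of scaled dot-product attention, and the element-wise gate weights `w_h`, `w_c`, `bias` are hypothetical and are not taken from the paper.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attend(query, ctx_states):
    """Scaled dot-product attention of one source state over the context
    encoder states (assumed form; the states act as both keys and values)."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in ctx_states])
    dim = len(ctx_states[0])
    return [sum(w * k[i] for w, k in zip(weights, ctx_states)) for i in range(dim)]


def gated_fusion(h, c, w_h, w_c, bias):
    """Assumed element-wise gate: g = sigmoid(w_h*h + w_c*c + b),
    fused = g*h + (1 - g)*c."""
    fused = []
    for hi, ci, wh, wc, b in zip(h, c, w_h, w_c, bias):
        g = sigmoid(wh * hi + wc * ci + b)
        fused.append(g * hi + (1.0 - g) * ci)
    return fused


def fuse_sentence(src_states, ctx_states, w_h, w_c, bias):
    """Fuse every source position with its attended context vector."""
    return [
        gated_fusion(h, attend(h, ctx_states), w_h, w_c, bias)
        for h in src_states
    ]


if __name__ == "__main__":
    src = [[1.0, 2.0], [0.5, -1.0]]   # toy source encoder output (2 tokens)
    ctx = [[3.0, 4.0], [0.0, 1.0]]    # toy context encoder output (2 tokens)
    zeros = [0.0, 0.0]
    print(fuse_sentence(src, ctx, zeros, zeros, zeros))
```

With all gate parameters set to zero the gate equals 0.5 at every dimension, so each fused vector is the plain average of the source state and its attended context vector; trained parameters would instead let the model decide, per dimension, how much context to admit.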

Keywords: document-level translation; neural machine translation; topic model; dual encoder; sentence representation

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
