A Novel Chinese-English Neural Machine Translation Model Based on BERT  

Author: Linlin Zhang

Affiliation: School of Foreign Languages, Shenyang Normal University, Shenyang 110034, China

Source: IJLAI Transactions on Science and Engineering, 2025, No. 1, pp. 59-65 (7 pages)

Funding: Supported by a 2023 project of the Research Center for the Theory System of Socialism with Chinese Characteristics at Shenyang Normal University (Project No. ZTSYB2023012: A Study of Xi Jinping Thought on Socialism with Chinese Characteristics in the Course of Foreign Affairs Translation).

Abstract: In recent years, neural machine translation has developed rapidly and replaced traditional machine translation as the mainstream paradigm in the field. Machine translation reduces translation costs and improves translation efficiency, which benefits cultural exchange and international cooperation and supports national development. However, neural machine translation depends heavily on large-scale, high-quality parallel corpora, which suffer from problems such as uneven quality and data sparsity, so further study of neural machine translation is imperative. The purpose of this paper is to construct a pseudo-parallel corpus using data augmentation, improve the diversity of Chinese and English material, and then optimize the translation model to improve its translation quality. Building on BERT pre-training, this paper first analyzes the limitations of the standard Transformer model and then proposes two directions for optimization. On the one hand, in the data preprocessing stage, multi-granularity word segmentation is applied so that the Chinese-English neural machine translation model can better understand the text (see the segmentation sketch below). On the other hand, in the pre-training stage, this paper deeply integrates BERT dynamic word embeddings with the original word embeddings: a fusion module is added to the original Transformer, in which the original word embeddings and the BERT dynamic word embeddings are concatenated by a simple linear layer and fed into the encoder, and an attention mechanism then performs deeper integration to produce better word vector representations (see the fusion sketch below), enabling the Transformer to take full advantage of the external semantic information introduced by BERT. Finally, the feasibility and effectiveness of the adopted Transformer architecture are verified by a comparison experiment between an RNN model and the Transformer, and ablation experiments on different word vector representations and on applying BERT at different stages further examine the contribution of each component.

Keywords: Transformer; Chinese-English neural machine translation; BERT; multi-granularity word segmentation

Classification Code: TP3 [Automation and Computer Technology / Computer Science and Technology]
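
The multi-granularity segmentation step described in the abstract can be illustrated with a small, hedged example. The paper does not name its segmentation tool; the sketch below assumes the common jieba library and simply shows three granularities (single characters, standard words, and all candidate words) of the same sentence.

```python
# Toy illustration of multi-granularity Chinese segmentation.
# jieba is an assumption here; the paper does not specify its segmenter.
import jieba

sentence = "神经机器翻译发展迅速"  # "Neural machine translation is developing rapidly."

char_level = list(sentence)                     # finest granularity: single characters
word_level = jieba.lcut(sentence)               # standard word-level segmentation
full_mode = jieba.lcut(sentence, cut_all=True)  # all candidate words (overlapping, coarser cues)

print(char_level)
print(word_level)
print(full_mode)
```

Exposing the model to several granularities of the same sentence gives it both fine-grained character cues and word-level units, which is the intuition behind the multi-granularity preprocessing described in the abstract.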
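The embedding-fusion strategy can likewise be sketched in code. The following is a minimal PyTorch sketch under stated assumptions: the name FusionModule, the dimensions, and the use of multi-head self-attention with a residual connection are illustrative choices, not the paper's actual implementation; only the overall flow (concatenate the two embeddings, project them linearly, then integrate with attention before the encoder) follows the abstract.

```python
# Minimal sketch of fusing original token embeddings with BERT dynamic
# embeddings before the Transformer encoder. Names and sizes are assumptions.
import torch
import torch.nn as nn

class FusionModule(nn.Module):
    def __init__(self, d_model: int = 512, d_bert: int = 768, n_heads: int = 8):
        super().__init__()
        # "Simple linear splicing": concatenate the two embeddings and
        # project back to the encoder's model dimension.
        self.proj = nn.Linear(d_model + d_bert, d_model)
        # Attention used for the "deep integration" step.
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, tok_emb: torch.Tensor, bert_emb: torch.Tensor) -> torch.Tensor:
        # tok_emb:  (batch, seq_len, d_model)  original word embeddings
        # bert_emb: (batch, seq_len, d_bert)   BERT dynamic word embeddings
        fused = self.proj(torch.cat([tok_emb, bert_emb], dim=-1))
        # Self-attention over the fused sequence spreads BERT's contextual
        # signal across positions before the result enters the encoder.
        attn_out, _ = self.attn(fused, fused, fused)
        return self.norm(fused + attn_out)  # residual + norm, then to encoder

# Example shapes: a batch of 2 sentences of length 10.
tok = torch.randn(2, 10, 512)
bert = torch.randn(2, 10, 768)
out = FusionModule()(tok, bert)  # -> torch.Size([2, 10, 512])
```

The residual connection and layer normalization are added here for training stability and are an assumption; the abstract specifies only concatenation followed by attention-based integration.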

 
