Incorporating LSSD Strategy into NAT-Transformer for Low-Resource Neural Machine Translation

Authors: WANG Fanglin (王方林); SU Xueping (苏雪平); LEI Yihang (雷一航) (School of Electronic Information, Xi'an Engineering University, Xi'an 710000, Shaanxi)

Affiliation: [1] School of Electronic Information, Xi'an Engineering University, Xi'an 710000, Shaanxi, China

Source: Changjiang Information & Communications, 2025, Issue 2, pp. 30-33 (4 pages)

Funding: Natural Science Basic Research Program of Shaanxi Province (2024JC-YBMS-455); Xi'an Beilin District Applied Technology R&D Reserve Project (GX2405); Youth Innovation Team of Shaanxi Universities; Scientific Research Program of the Shaanxi Provincial Department of Education (24JP071)

Abstract: As neural machine translation technology continues to advance, the demand for translating low-resource languages is increasing rapidly. However, existing translation models often face challenges such as low translation quality and long inference time when dealing with low-resource languages. To address these issues, this paper proposes an NAT-Transformer algorithm based on the LSSD strategy. In the preprocessing phase, bilingual parallel corpora are obtained and BPE (byte-pair encoding) is applied to map the data into a common vector space. In the pre-training phase, a multi-head self-attention mechanism and an encoder-decoder stack structure are used to effectively capture dependencies within sequences and ultimately produce translation predictions. At this stage, the LSSD (Language-Specific Self-Distillation) strategy is also incorporated, with gradients optimized via SGD (stochastic gradient descent) to improve the model's performance on specific languages. Finally, in the fine-tuning phase, the pre-trained model is used to further adjust the model parameters. Experimental results demonstrate that the NAT-Transformer algorithm incorporating the LSSD strategy scores approximately 3 BLEU points higher than the baseline model, showing promising potential and performance.
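
The abstract describes the LSSD objective only at a high level. Below is a minimal, hypothetical PyTorch sketch of a language-specific self-distillation loss of the kind described, combining standard cross-entropy with a KL term against a frozen, language-specific teacher snapshot of the same model; it is not the authors' implementation, and names such as `lssd_loss`, `alpha`, and `temperature` are illustrative assumptions.

```python
# Sketch only: a language-specific self-distillation (LSSD-style) loss.
# The "teacher" stands in for a per-language snapshot of the same model.
import torch
import torch.nn.functional as F

def lssd_loss(student_logits, teacher_logits, target_ids, pad_id,
              alpha=0.5, temperature=1.0):
    """Blend label cross-entropy with KL distillation from a language-specific teacher.

    student_logits, teacher_logits: (batch, seq_len, vocab)
    target_ids: (batch, seq_len) reference token ids
    """
    vocab = student_logits.size(-1)
    # Supervised term: cross-entropy against the reference translation, ignoring padding.
    ce = F.cross_entropy(student_logits.reshape(-1, vocab),
                         target_ids.reshape(-1), ignore_index=pad_id)
    # Distillation term: KL divergence between student and frozen teacher distributions.
    student_logp = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_p = F.softmax(teacher_logits.detach() / temperature, dim=-1)
    kl = F.kl_div(student_logp, teacher_p, reduction="none").sum(-1)
    mask = target_ids.ne(pad_id).float()
    kl = (kl * mask).sum() / mask.sum()
    # alpha weighs supervision against self-distillation; it is a tunable assumption here.
    return (1 - alpha) * ce + alpha * (temperature ** 2) * kl
```

Training under such a loss could use torch.optim.SGD, consistent with the stochastic gradient descent mentioned in the abstract, while the non-autoregressive decoder predicts all target positions in parallel to reduce inference time.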

Keywords: low-resource machine translation; NAT-Transformer; self-distillation strategy

CLC Number: TP393 [Automation and Computer Technology - Computer Application Technology]

 
