Authors: CHI Chuncheng (迟春诚); LI Manjing (李蔓菁); YAN Hong (闫红) [2]; LI Fuxue (李付学) [2]
Affiliations: [1] College of Computer Science and Technology, Shenyang University of Chemical Technology, Shenyang, Liaoning 110142, China; [2] College of Electrical Engineering, Yingkou Institute of Technology, Yingkou, Liaoning 115014, China
Source: Journal of Anshan Normal University (《鞍山师范学院学报》), 2023, No. 2, pp. 64-70 (7 pages)
Funding: Natural Science Foundation of Liaoning Province (2021-YKLH-12; 2022-YKLH-18).
Abstract: Neural machine translation performs well when bilingual resources are abundant, but its translation quality drops sharply in low-resource settings. For low-resource translation tasks, this paper proposes a data augmentation method based on subtree exchange. First, a syntactic tree is generated for each target-side sentence. Second, a subtree exchange algorithm is applied to generate new pseudo-monolingual data. Finally, back-translation is used to produce the target translation, yielding pseudo-parallel data. Experimental results on several translation tasks show that, compared with the baseline model and existing data augmentation methods, the data augmentation method based on syntactic subtree exchange significantly improves translation quality.
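The subtree exchange step can be illustrated with a minimal sketch. The abstract does not specify the paper's actual algorithm, so everything here is an assumption made for illustration: parse trees are represented as nested tuples `(label, child, ...)` with words as string leaves, the helper names (`leaves`, `find_path`, `replace_at`, `exchange_subtrees`) are hypothetical, and the rule "swap the first non-root subtree carrying a given label" is a simplification. Parsing and back-translation are not shown.

```python
# Illustrative sketch only: NOT the paper's algorithm, which the abstract
# does not describe in detail. A tree is either a word (str) or a tuple
# (label, child, child, ...).

def leaves(tree):
    """Return the words of a tree in left-to-right order."""
    if isinstance(tree, str):
        return [tree]
    words = []
    for child in tree[1:]:
        words.extend(leaves(child))
    return words

def find_path(tree, label, path=()):
    """Return child indices leading to the first non-root node with `label`, or None."""
    if isinstance(tree, str):
        return None
    if tree[0] == label and path:  # an empty path means the root, which is skipped
        return path
    for i, child in enumerate(tree[1:], start=1):
        found = find_path(child, label, path + (i,))
        if found is not None:
            return found
    return None

def get_at(tree, path):
    """Follow a path of child indices down the tree."""
    for i in path:
        tree = tree[i]
    return tree

def replace_at(tree, path, new_subtree):
    """Return a copy of `tree` with the node at `path` replaced by `new_subtree`."""
    if not path:
        return new_subtree
    i = path[0]
    return tree[:i] + (replace_at(tree[i], path[1:], new_subtree),) + tree[i + 1:]

def exchange_subtrees(tree_a, tree_b, label):
    """Swap the first `label` subtree of each tree; return both unchanged if either lacks one."""
    path_a = find_path(tree_a, label)
    path_b = find_path(tree_b, label)
    if path_a is None or path_b is None:
        return tree_a, tree_b
    sub_a, sub_b = get_at(tree_a, path_a), get_at(tree_b, path_b)
    return replace_at(tree_a, path_a, sub_b), replace_at(tree_b, path_b, sub_a)

# Two toy target-side parse trees (hand-written, not parser output).
tree_a = ("S", ("NP", "the", "cat"), ("VP", "sleeps"))
tree_b = ("S", ("NP", "a", "dog"), ("VP", "barks"))

new_a, new_b = exchange_subtrees(tree_a, tree_b, "NP")
print(" ".join(leaves(new_a)))  # a dog sleeps
print(" ".join(leaves(new_b)))  # the cat barks
```

Swapping only subtrees that share a syntactic label is one plausible way to keep the new sentences grammatical. In a pipeline like the one the abstract outlines, the resulting pseudo-monolingual sentences would then be passed to a back-translation model to form pseudo-parallel pairs.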
Classification: TP391.1 [Automation and Computer Technology: Computer Application Technology]