Authors: 王剑 (WANG Jian) [1], 张莹 (ZHANG Ying), 余正涛 (YU Zheng-tao) [1,2], 黄于欣 (HUANG Yu-xin) [1,2]
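Affiliations: [1] Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China; [2] Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming 650500, China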
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2022, No. 5, pp. 992-997 (6 pages)
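Funding: National Key Research and Development Program of China (2018YFC0830105, 2018YFC0830101, 2018YFC0830100); National Natural Science Foundation of China (61972186, 61762056, 61472168); Yunnan Major Science and Technology Special Program (202002AD080001); Yunnan High-Tech Industry Special Project (201606).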
Abstract: Cross-lingual summarization generates a summary in a target language from input text in a source language. Most current approaches rely on machine translation, but for low-resource languages such as Vietnamese, poor translation quality is a major obstacle to Chinese-Vietnamese cross-lingual summarization. To address this, the paper proposes a Chinese-Vietnamese cross-lingual summarization method based on word alignment and semi-supervised adversarial learning. The idea is to align Chinese and Vietnamese in a shared semantic space to obtain aligned bilingual features, and then to use both sets of features together when generating the summary. Concretely, within an encoder-decoder framework, a BERT encoder first produces vector representations of the input Chinese and Vietnamese texts; semi-supervised adversarial learning guided by a Chinese-Vietnamese bilingual dictionary then aligns the bilingual word vectors in the same semantic space; finally, an attention mechanism attends to the bilingual context vectors simultaneously, and a Transformer decoder generates the target-language summary. Experiments on a collected Chinese-Vietnamese summarization dataset show that the method effectively improves the performance of Chinese-Vietnamese cross-lingual summarization models.
Classification No.: TP391 [Automation and Computer Technology: Computer Application Technology]
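Illustrative note: the final step described in the abstract, attending to the Chinese and Vietnamese context vectors at the same time, might look roughly like the sketch below. This is not the authors' implementation. The function names (`attend`, `dual_context`), the use of scaled dot-product attention, and the choice to concatenate the two context vectors are all assumptions made for illustration; the abstract does not specify how the two contexts are combined.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, states):
    """Scaled dot-product attention of one query over a list of state vectors.

    Returns the weighted-average context vector and the attention weights.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * s for q, s in zip(query, state)) / scale for state in states]
    weights = softmax(scores)
    dim = len(states[0])
    context = [sum(w * state[i] for w, state in zip(weights, states)) for i in range(dim)]
    return context, weights

def dual_context(query, zh_states, vi_states):
    """Attend to both languages with the same decoder query.

    Assumption: the two context vectors are fused by concatenation.
    """
    zh_ctx, zh_w = attend(query, zh_states)
    vi_ctx, vi_w = attend(query, vi_states)
    return zh_ctx + vi_ctx, zh_w, vi_w

# Toy encoder states standing in for aligned Chinese and Vietnamese features.
query = [1.0, 0.0]
zh_states = [[1.0, 0.0], [0.0, 1.0]]
vi_states = [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0]]
fused, zh_w, vi_w = dual_context(query, zh_states, vi_states)
```

Because both languages are attended to with the same query, this only makes sense once the two sets of encoder states live in a shared space, which is the role the abstract assigns to the dictionary-guided adversarial alignment step.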