Improving neural sentence alignment with word translation  Cited by: 2


Authors: Ying DING, Junhui LI, Zhengxian GONG, Guodong ZHOU

Affiliation: [1] School of Computer Science and Technology, Soochow University, Suzhou 215006, China

Source: Frontiers of Computer Science (中国计算机科学前沿(英文版)), 2021, Issue 1, pp. 81-90 (10 pages)

Funding: This work was supported by the National Natural Science Foundation of China (Grant Nos. 61876120, 61673290).

Abstract: Sentence alignment is a basic task in natural language processing that aims to extract high-quality parallel sentences automatically. Motivated by the observation that aligned sentence pairs contain more aligned words than unaligned ones, we treat word translation as one of the most useful sources of external knowledge. In this paper, we show how to explicitly integrate word translation into neural sentence alignment. Specifically, this paper proposes three cross-lingual encoders to incorporate word translation: 1) a Mixed Encoder that learns annotation vectors for words and their translations over sequences in which words and their translations are mixed alternately; 2) a Factored Encoder that views word translations as features and encodes words and their translations by concatenating their embeddings; and 3) a Gated Encoder that uses a gating mechanism to selectively control the amount of word translation information moving forward. Experiments on the NIST MT and OpenSubtitles Chinese-English datasets, in both non-monotonic and monotonic scenarios, demonstrate that all the proposed encoders significantly improve sentence alignment performance.
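The three ways of combining words with their translations described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy vectors, and in particular the gate formula (a sigmoid over the concatenated embeddings, with the gated translation added to the word embedding) are assumptions made for illustration only. The abstract does not specify these details.

```python
import math

def mixed_sequence(words, translations):
    """Mixed-encoder input: interleave each word with its translation (w1, t1, w2, t2, ...)."""
    mixed = []
    for word, translation in zip(words, translations):
        mixed.extend([word, translation])
    return mixed

def factored_input(word_vec, trans_vec):
    """Factored-encoder input: concatenate the word and translation embeddings."""
    return list(word_vec) + list(trans_vec)

def gated_input(word_vec, trans_vec, gate_weights, gate_bias):
    """Gated-encoder input (assumed form, not taken from the paper).

    g = sigmoid(W [w; t] + b), output = w + g * t (element-wise),
    so the gate g controls how much translation information passes forward.
    """
    joint = list(word_vec) + list(trans_vec)
    output = []
    for i, (w_i, t_i) in enumerate(zip(word_vec, trans_vec)):
        z = sum(a * x for a, x in zip(gate_weights[i], joint)) + gate_bias[i]
        gate = 1.0 / (1.0 + math.exp(-z))
        output.append(w_i + gate * t_i)
    return output

# Toy example: two-dimensional embeddings and an all-zero gate (hypothetical values).
word_vec = [1.0, 2.0]
trans_vec = [3.0, 4.0]
zero_weights = [[0.0] * 4, [0.0] * 4]
zero_bias = [0.0, 0.0]

mixed = mixed_sequence(["我", "爱"], ["I", "love"])
factored = factored_input(word_vec, trans_vec)
gated = gated_input(word_vec, trans_vec, zero_weights, zero_bias)
```

In this sketch the mixed variant doubles the sequence length, the factored variant doubles the input dimension, and the gated variant keeps both unchanged while letting learned parameters decide how much of each translation to admit.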

Keywords: sentence alignment; word translation; mixed encoder; factored encoder; gated encoder

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]

 
