A Japanese-English Hierarchical Phrase-based Translation Model Integrating Tense Features (融合时态特征的日英层次短语翻译模型)


Authors: Ming Fang (明芳)[1], Xu Jin'an (徐金安)[1], Wang Nan (王楠)[1], Chen Yufeng (陈钰枫)[1], Zhang Yujie (张玉洁)[1]

Affiliation: [1] School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China

Source: Computer and Modernization (《计算机与现代化》), 2017, No. 6, pp. 1-7 (7 pages)

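Funding: National Natural Science Foundation of China (61370130; 61473294); Fundamental Research Funds for the Central Universities (2015JBM033); International Science and Technology Cooperation Program of the Ministry of Science and Technology (K11F100010)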

Abstract: Statistical machine translation based on the hierarchical phrase-based (HPB) model uses only limited contextual information, and its tense translation quality is poor. To address this, the paper proposes a Japanese-English statistical machine translation method that integrates tense features. By introducing tense classification constraints on translation rules, the method lets the decoder use each rule's potential tense class to match a sentence of a given tense with the most suitable rules. First, tense features are extracted from the bilingual training corpus to build a maximum entropy classification model. Next, tense features are extracted from the hierarchical phrase rules that carry tense information, and those rules are classified. Finally, the tense classification results of the rules are added as a new feature to the HPB translation system. Experimental results show that, compared with the baseline system, the method improves translation quality on multiple test sets and partly resolves the tense translation problem of the Japanese-English HPB model.
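The three steps in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tense label set, the feature names (`tgt_verb_form=...`), the classifier weights, and the interpolation weights `lambda_tm` and `lambda_tense` are all hypothetical, chosen only to show how a maximum entropy tense probability might enter a log-linear rule score as one extra feature.

```python
import math
from dataclasses import dataclass

# Hypothetical tense label set; the abstract does not list the classes used.
TENSES = ("past", "present", "future")


def maxent_tense_probs(features, weights):
    """Return p(tense | features) under a maximum entropy (log-linear) model.

    `weights` maps (feature, tense) pairs to real-valued weights; pairs that
    are absent contribute 0. All weights here are illustrative assumptions.
    """
    scores = {t: sum(weights.get((f, t), 0.0) for f in features) for t in TENSES}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {t: math.exp(s - top) for t, s in scores.items()}
    total = sum(exps.values())
    return {t: e / total for t, e in exps.items()}


@dataclass
class Rule:
    source: str                  # source side of a hierarchical rule
    target: str                  # target side of the rule
    log_translation_prob: float  # stands in for the baseline HPB features
    tense_features: tuple        # hypothetical tense features of the rule


def rule_score(rule, sentence_tense, weights, lambda_tm=1.0, lambda_tense=0.5):
    """Log-linear rule score with one added tense feature (assumed form)."""
    probs = maxent_tense_probs(rule.tense_features, weights)
    tense_feature = math.log(probs[sentence_tense])
    return lambda_tm * rule.log_translation_prob + lambda_tense * tense_feature


# Toy classifier weights (assumed, not learned from any corpus).
toy_weights = {
    ("tgt_verb_form=VBD", "past"): 2.0,
    ("tgt_verb_form=VBP", "present"): 2.0,
}

# Two candidate rules with identical baseline scores but different tense cues.
past_rule = Rule("X1 を 食べ た", "ate X1", -1.0, ("tgt_verb_form=VBD",))
present_rule = Rule("X1 を 食べ た", "eat X1", -1.0, ("tgt_verb_form=VBP",))

score_past = rule_score(past_rule, "past", toy_weights)
score_present = rule_score(present_rule, "past", toy_weights)
```

With these toy values, the two rules tie on the baseline score, and the added tense feature ranks `past_rule` above `present_rule` for a sentence classified as past tense. In a real system the classifier weights would be trained on the bilingual corpus and the feature weight tuned together with the other decoder features.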

Keywords: hierarchical phrase-based translation model; tense features; maximum entropy classification model

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
