Noun Phrase Alignment in the Korean-Chinese Bilingual Corpus Based on Statistics and Lexicon  Cited by: 4

Original title: 基于统计和词典方法相结合的韩汉双语语料库名词短语对齐


Authors: LING Tianbin (凌天斌); BI Yude (毕玉德) (The PLA Strategic Support Force Information Engineering University, Luoyang, Henan 471003, China)

Affiliation: [1] The PLA Strategic Support Force Information Engineering University, Luoyang, Henan 471003, China

Source: Journal of Chinese Information Processing (《中文信息学报》), 2018, No. 8, pp. 27-31 (5 pages)

Abstract: Phrase alignment in a Korean-Chinese bilingual corpus is of great significance to example-based Korean-Chinese machine translation systems. Starting from the structural features of Korean noun phrases, and building on an experimental analysis of statistics-based and lexicon-based word alignment methods, this paper proposes a noun phrase alignment method for the Korean-Chinese bilingual corpus based on word alignment position information. The method first obtains word alignment position information with a statistics-based approach, then corrects the word alignment using similarity calculation from the lexicon-based approach. On top of these results, Korean noun phrases and their Chinese translations are extracted using left and right boundary rules for Korean noun phrases, and the candidate pairs are filtered with an association measure to produce the final noun phrase alignment. The experiments show that the proposed method achieves satisfactory phrase alignment results on a relatively large-scale corpus.
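The last two steps of the abstract (mapping a Korean noun phrase to its Chinese translation through word alignment positions, then filtering candidate pairs with an association measure) can be loosely sketched as below. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not name the association measure, so the Dice coefficient is used here as a stand-in, and the names `project_span`, `dice`, `filter_pairs`, and the `threshold` value are hypothetical. The word alignment is assumed to be already available as a mapping from Korean token indices to Chinese token indices.

```python
from collections import Counter
from typing import Dict, Iterable, List, Optional, Tuple

# Assumed input format: Korean token index -> aligned Chinese token index.
Alignment = Dict[int, int]


def project_span(alignment: Alignment, start: int, end: int) -> Optional[Tuple[int, int]]:
    """Map the Korean span [start, end] (inclusive) to the smallest
    Chinese span that covers all of its aligned tokens."""
    targets = [alignment[i] for i in range(start, end + 1) if i in alignment]
    if not targets:
        return None  # no token in the phrase is aligned
    return min(targets), max(targets)


def dice(pair_count: int, src_count: int, tgt_count: int) -> float:
    """Dice coefficient 2*c(s,t) / (c(s) + c(t)); a stand-in measure,
    since the abstract does not say which association measure is used."""
    total = src_count + tgt_count
    return 2.0 * pair_count / total if total else 0.0


def filter_pairs(
    candidates: Iterable[Tuple[str, str]], threshold: float = 0.5
) -> List[Tuple[str, str, float]]:
    """Keep candidate (Korean NP, Chinese translation) pairs whose
    association score over the whole corpus reaches the threshold."""
    candidates = list(candidates)
    pair_counts = Counter(candidates)
    src_counts = Counter(src for src, _ in candidates)
    tgt_counts = Counter(tgt for _, tgt in candidates)
    kept = []
    for (src, tgt), count in pair_counts.items():
        score = dice(count, src_counts[src], tgt_counts[tgt])
        if score >= threshold:
            kept.append((src, tgt, score))
    # Highest-scoring pairs first.
    return sorted(kept, key=lambda item: -item[2])
```

In use, `project_span` would be called once per extracted Korean noun phrase in each sentence pair, the resulting (Korean phrase, Chinese span text) pairs collected over the corpus, and `filter_pairs` applied to the collected list. The threshold would need tuning on real data.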

Keywords: bilingual corpus; word alignment; phrase alignment

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]

 
