Research on Extraction of Translation Equivalents from a Chinese-English Comparable Corpus (cited by: 9)

Source: Computer Engineering and Applications (《计算机工程与应用》), 2007, No. 32, pp. 44-46, 71 (4 pages)

Funding: National Natural Science Foundation of China (No. 60572132); 2005 NSFC international cooperation funding (No. 60520130297).

Abstract: This paper reviews the classification of corpora and the history of research on extracting translation equivalents from comparable corpora. Extraction from a comparable corpus rests on one basic hypothesis: when a word in one language is mapped to another language, the co-occurrence relations between the word and its surrounding words are preserved, so the context distributions of mutual translations are correlated. On this basis, the paper uses three techniques to improve extraction accuracy: extracting equivalents in both directions and taking the intersection of the two result sets; weighting words by TF(iw)*IDF(i); and using the part-of-speech (POS) information of context words. The paper describes the steps of the extraction experiment and briefly analyzes the results, which show that these techniques effectively improve the accuracy of the extracted translation equivalent candidates. It closes with issues that require further research.
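The pipeline named in the abstract (context vectors, TF*IDF weighting, vector similarity, bidirectional extraction with intersection) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the window size, the smoothed IDF form `log(1 + N/df)`, the use of cosine similarity, and the seed dictionaries used to project a context vector into the other language are all assumptions made for illustration. The POS filtering of context words mentioned in the abstract is left out.

```python
import math
from collections import Counter


def context_vectors(sentences, window=2):
    """Map each word to counts of the words seen within +/- window tokens."""
    vectors = {}
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            ctx = vectors.setdefault(word, Counter())
            for j in range(lo, hi):
                if j != i:
                    ctx[tokens[j]] += 1
    return vectors


def tfidf_weight(vectors):
    """Reweight counts by TF * IDF (assumed form: IDF = log(1 + N / df))."""
    n = len(vectors)
    df = Counter()
    for ctx in vectors.values():
        df.update(ctx.keys())  # number of context vectors containing each word
    return {
        word: {c: tf * math.log(1 + n / df[c]) for c, tf in ctx.items()}
        for word, ctx in vectors.items()
    }


def project(vec, seed_dict):
    """Carry a context vector into the other language via a seed dictionary."""
    out = {}
    for c, weight in vec.items():
        for t in seed_dict.get(c, ()):
            out[t] = out.get(t, 0.0) + weight
    return out


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def top_candidates(src_vecs, tgt_vecs, seed_dict, k=1):
    """For each source word, keep the k most similar target words (score > 0)."""
    result = {}
    for s, sv in src_vecs.items():
        p = project(sv, seed_dict)
        scored = [(cosine(p, tv), t) for t, tv in tgt_vecs.items()]
        scored = sorted((x for x in scored if x[0] > 0), reverse=True)
        result[s] = {t for _, t in scored[:k]}
    return result


def bidirectional_pairs(zh_vecs, en_vecs, zh2en, en2zh, k=1):
    """Keep only pairs proposed in both directions (the intersection step)."""
    fwd = top_candidates(zh_vecs, en_vecs, zh2en, k)
    bwd = top_candidates(en_vecs, zh_vecs, en2zh, k)
    return {(z, e) for z, es in fwd.items() for e in es if z in bwd.get(e, ())}
```

In this sketch a pair survives only if each word ranks the other among its top k candidates, which is one simple way to trade recall for precision in the spirit of the intersection step.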

Keywords: comparable corpus; translation equivalent extraction; context vector; vector similarity computation

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
