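Authors: Hu Guoquan (胡国全)[1], Chen Jiajun (陈家骏)[1], Dai Xinyu (戴新宇)[1], Yin Cunyan (尹存燕)[1]
Affiliation: [1] State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, Jiangsu 210093, China
Source: Computer Engineering and Design (《计算机工程与设计》), 2005, No. 4, pp. 900-903, 906 (5 pages)
Funding: National 863 High-Tech Research and Development Program of China (2001AA117010)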
Abstract: This paper presents a Chinese-to-English machine translation strategy based on example-based machine translation (EBMT). It focuses on the design of a Chinese-English bilingual corpus and on the algorithm that matches Chinese input sentences against that corpus. In view of the characteristics of Chinese, sentences are matched directly at the level of Chinese characters, without first segmenting them into words. Determining the boundaries of matched fragments is one of the main difficulties of EBMT, and a corresponding solution is adopted for it. The assembly of matched fragments into a complete translation is not studied in depth, because the strategy is meant to serve as one engine in a multi-engine translation system, where it works together with other translation strategies to improve the accuracy of the output. EBMT needs a large bilingual corpus as the basis for translation, and building such a corpus by hand is time-consuming and laborious, so the authors try to construct the Chinese-English corpus automatically by computer. This includes passage-level alignment and word-level alignment; statistical methods, such as co-occurrence frequency, are used to align sentences and words.
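The character-level matching described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm, which the abstract does not specify: it swaps in Python's standard `difflib.SequenceMatcher` to find fragments that an input sentence shares with each example, comparing character by character with no word segmentation. The function names, the `min_len` threshold, and the tiny `EXAMPLES` list are all hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical toy example base: (Chinese source, English target) pairs.
EXAMPLES = [
    ("我喜欢苹果", "I like apples"),
    ("他去学校", "He goes to school"),
]

def char_matches(query, example, min_len=2):
    """Return fragments shared by query and example, compared character by character.

    No word segmentation is done; min_len (an assumed threshold) drops
    very short matches that are unlikely to be useful fragments.
    """
    matcher = SequenceMatcher(None, query, example, autojunk=False)
    return [
        query[block.a:block.a + block.size]
        for block in matcher.get_matching_blocks()
        if block.size >= min_len
    ]

def best_example(query, examples):
    """Pick the pair whose source side covers the most characters of the query."""
    def coverage(pair):
        return sum(len(fragment) for fragment in char_matches(query, pair[0]))
    return max(examples, key=coverage)

if __name__ == "__main__":
    source, target = best_example("我喜欢香蕉", EXAMPLES)
    print(source, "->", target)
```

In this sketch the fragment boundaries are simply whatever the longest-common-block search returns; the paper treats boundary determination as a separate problem with its own solution, which is not reproduced here.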