Research on Statistical Machine Translation-Based Chinese-Uyghur Word Alignment (cited by: 4)

Source: Computer Applications and Software (《计算机应用与软件》), 2011, No. 4, pp. 57-59, 90 (4 pages)

Funding: National Natural Science Foundation of China (60663006); State Language Commission research project (MZ115-75)

Abstract: This paper describes a Chinese-Uyghur word alignment system based on statistical machine translation. Processing is divided into two modules: preprocessing and word alignment. Preprocessing covers both the Chinese text and the Uyghur text. For Uyghur, the text is first converted to Latin-script Uyghur, and certain characters in the Latin-script text are then replaced with unambiguous ones. Word alignment first applies IBM Models 1-3 and then refines the result using the heuristic approach proposed by Och et al., yielding a Chinese-Uyghur word alignment system based on statistical machine translation. Experimental results show that the system is feasible.

Keywords: word alignment; IBM Model 1-3; heuristic optimization

Classification: TP391.2 [Automation and Computer Technology: Computer Application Technology]
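
The alignment stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it trains only IBM Model 1 (the paper uses Models 1-3) with EM in both translation directions, then keeps the intersection of the two directional alignments, a simple symmetrization step of the kind used in Och-style refinement heuristics. Whether the paper uses exactly this step is an assumption. The function names, the `<NULL>` token, and the three-sentence toy corpus are all hypothetical.

```python
from collections import defaultdict

NULL = "<NULL>"  # hypothetical empty-word token, so a word may align to nothing


def train_ibm1(pairs, iterations=10):
    """Estimate IBM Model 1 translation probabilities t[(f, e)] = t(f | e) with EM.

    pairs: list of (source_tokens, target_tokens).
    """
    pairs = [([NULL] + src, tgt) for src, tgt in pairs]
    tgt_vocab = {f for _, tgt in pairs for f in tgt}
    uniform = 1.0 / len(tgt_vocab)
    t = defaultdict(lambda: uniform)  # uniform start
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (f, e) links
        total = defaultdict(float)  # expected count of e being linked at all
        for src, tgt in pairs:
            for f in tgt:
                norm = sum(t[(f, e)] for e in src)
                for e in src:
                    frac = t[(f, e)] / norm  # E-step: posterior of link f-e
                    count[(f, e)] += frac
                    total[e] += frac
        # M-step: renormalize expected counts per source word
        t = defaultdict(float, {(f, e): c / total[e] for (f, e), c in count.items()})
    return t


def viterbi_align(t, src, tgt):
    """Link each target position j to its most probable source position i.

    Returns a set of (i, j) pairs; links to NULL are dropped.
    """
    candidates = [NULL] + src
    links = set()
    for j, f in enumerate(tgt):
        best = max(range(len(candidates)), key=lambda i: t[(f, candidates[i])])
        if best > 0:
            links.add((best - 1, j))
    return links


def symmetrize_intersection(pairs, iterations=10):
    """Train both directions and keep only links that both directions agree on."""
    fwd = train_ibm1(pairs, iterations)
    rev = train_ibm1([(tgt, src) for src, tgt in pairs], iterations)
    result = []
    for src, tgt in pairs:
        a_fwd = viterbi_align(fwd, src, tgt)
        # the reverse model yields (target_index, source_index); flip to match
        a_rev = {(i, j) for (j, i) in viterbi_align(rev, tgt, src)}
        result.append(a_fwd & a_rev)
    return result


# Toy corpus (illustrative only): segmented Chinese vs. Latin-script Uyghur.
corpus = [
    (["这", "书"], ["bu", "kitab"]),
    (["这", "笔"], ["bu", "qelem"]),
    (["书"], ["kitab"]),
]
alignments = symmetrize_intersection(corpus)
for (src, tgt), links in zip(corpus, alignments):
    print([(src[i], tgt[j]) for i, j in sorted(links)])
```

Intersection favors precision over recall. Practical systems usually grow the alignment from this intersection toward the union of the two directions, which this sketch leaves out.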
