Word Recognition in Source-Language Analysis for Chinese-English Machine Translation (cited by: 4)

Chinese Sentence Tokenization in a Chinese-English MT System

Author: Fu Aiping (傅爱平)[1]

Source: Journal of Chinese Information Processing (《中文信息学报》), 1999, No. 5, pp. 7-13 (7 pages)

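Abstract: The first problem encountered in source-language analysis for Chinese-English MT is word recognition. The "word" in Chinese has no clear definition, and there are no sharp boundaries between morpheme and word, between word and phrase, or between phrase and sentence. Under the approach of segmenting first and parsing afterwards, word-formation problems and syntactic problems become entangled at the segmentation stage. The author argues that the character can be taken as the starting point of source-language syntactic analysis, so that words and phrases are recognized while syntactic analysis proceeds. The paper describes this view and its implementation, and uses the handling of separable words (离合词) as an example to illustrate the basic recognition method.

English abstract: The first problem we have met in source language analysis in a Chinese-English MT system is Chinese sentence tokenization, as in written Chinese there is no explicit word delimiter. Finding token boundaries for a character string is often interlaced with syntactic parsing, or even with semantic relations. This paper presents an approach that combines sentence tokenization with syntactic and semantic analysis. Instead of obtaining a tokenized word string before sentence parsing, the tokenizing component is built into the parser, i.e., syntactic and semantic information can be used for recognizing words when necessary during parsing, which is supported by a dictionary with descriptions of individual usage and a set of common rules.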
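The idea of recognizing a separable word directly from characters, with a dictionary entry plus a shared rule, can be illustrated with a small sketch. This is not the author's implementation, and it is not a parser: the dictionary `SEPARABLE`, the insertion rule `INSERT`, the `Token` type, and the function names are all hypothetical, chosen only to show how a verb-object word such as 洗澡 might still be identified when other material separates its two characters (洗了一个澡).

```python
import re
from dataclasses import dataclass

# Hypothetical dictionary entries: (verb morpheme, object morpheme)
# for a few separable verb-object words.
SEPARABLE = [("洗", "澡"), ("见", "面"), ("睡", "觉")]

# Hypothetical common rule for what may stand between the two morphemes:
# an optional aspect particle, then an optional numeral + classifier.
INSERT = re.compile(r"[了过着]?(?:[一二两三几][个次回])?")


@dataclass
class Token:
    text: str
    kind: str  # "separable-word", "insert", or "char"


def _separable_at(s, i):
    """Return (verb, inserted, obj, end) if a separable word starts at i."""
    for verb, obj in SEPARABLE:
        if not s.startswith(verb, i):
            continue
        start = i + len(verb)
        # The pattern can match the empty string, so a match always exists.
        gap_end = INSERT.match(s, start).end()
        if s.startswith(obj, gap_end):
            return verb, s[start:gap_end], obj, gap_end + len(obj)
    return None


def analyze(sentence):
    """Scan character by character, joining the parts of separable words."""
    tokens = []
    i = 0
    while i < len(sentence):
        found = _separable_at(sentence, i)
        if found is None:
            # No dictionary entry applies: keep the single character.
            tokens.append(Token(sentence[i], "char"))
            i += 1
            continue
        verb, inserted, obj, end = found
        tokens.append(Token(verb + obj, "separable-word"))
        if inserted:
            tokens.append(Token(inserted, "insert"))
        i = end
    return tokens


if __name__ == "__main__":
    for token in analyze("他洗了一个澡"):
        print(token.kind, token.text)
```

In a system of the kind the abstract describes, such a check would presumably run inside the parser and consult syntactic and semantic information. The sketch only shows the dictionary-plus-rule lookup starting from characters.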

Keywords: machine translation; automatic analysis of Chinese; automatic recognition of Chinese words

Classification code: H085 [Language and Script: Linguistics]

