从宾州中文树库观察三个汉语语法问题  (Cited by: 1)

Three Problems of Chinese Grammar Observed from the Penn Chinese Treebank

Authors: Huang Changning (黄昌宁) [1], Jin Guangjin (靳光瑾) [2]

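Affiliations: [1] Department of Computer Science and Technology, Tsinghua University, Beijing 100084; [2] Institute of Applied Linguistics, Ministry of Education, Beijing 100010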

Source: Linguistic Sciences (《语言科学》), 2013, No. 2, pp. 178-192 (15 pages)

Abstract: A treebank is an annotated (bracketed) corpus that records the syntactic parse tree of each sentence in it. This article introduces the Penn Chinese Treebank (CTB) built by the University of Pennsylvania, USA. One important goal of CTB annotation is to describe the predicate-argument structure of sentences. Its syntactic annotation therefore deliberately emphasizes three abstract grammatical relations: head-complement (complementation), head-adjunct (adjunction), and coordination. In CTB, the bracket pair or subtree dominated by each phrase node represents exactly one of these relations, so each relation is assigned a unique hierarchical structure. Although the CTB grammar formalism has many distinctive characteristics, this article discusses only three issues: 1) the complement (补足语), 2) the Chinese complementizer "的 (DEC)", and 3) the criteria for part-of-speech (POS) tagging based on X-bar theory. If one accepts that describing predicate-argument structure is an important goal of treebank construction, then these three issues are closely tied to that goal, and they affect both the performance of automatic POS tagging and parsing systems trained on the treebank and the results of their downstream applications.

Keywords: treebank; predicate-argument structure; complement; complementizer; part-of-speech tagging

Classification No.: H17 [Language and Linguistics: Chinese]; H146

 
