Affiliation: [1] Ministry of Education-Microsoft Key Laboratory of Language and Speech, Harbin Institute of Technology, Harbin 150001
Source: Chinese High Technology Letters (《高技术通讯》), 2007, No. 1, pp. 15-20 (6 pages)
Funding: Supported by the National Natural Science Foundation of China (60302021, 60375019) and the 863 Program (2004AA117010-08).
Abstract: This paper reports recent progress in parsing the Penn Chinese Treebank (CTB), parsing being one of the most important technologies of Chinese information processing. Building on the well-known head-driven model, parsing experiments were carried out on the newly available CTB5.0 for the first time. Compared with previous work on CTB, the experiments achieved more successful results and greatly narrowed the performance gap between Chinese and English parsing. The parser was evaluated on the standard test set with the PARSEVAL metric: on sentences with gold-standard word segmentation and POS tagging, it reached a precision of 85.89% and a recall of 85.61%. The paper describes how the model was implemented and further analyzes the roles that two key components, the decision table and base noun phrases (BNP), play in the parser. The work offers a useful reference for developing practical Chinese parsing systems.
Keywords: head-driven model; Penn Chinese Treebank; syntactic parsing; structural pattern recognition
Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]