Identifying Tibetan Syntactic Functional Chunk Boundaries Based on an Error-Driven Learning Strategy (cited by: 7)

Tibetan Chunking Based on Error-Driven Learning Strategy

Authors: Wang Tianhang [1], Shi Shumin [1,2], Long Congjun [3], Huang Heyan [1,2], Li Lin [3]

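Affiliations: [1] School of Computer Science, Beijing Institute of Technology, Beijing 100081; [2] Beijing Engineering Research Center of High Volume Language Information Processing and Cloud Computing Applications, Beijing 100081; [3] Institute of Ethnology and Anthropology, Chinese Academy of Social Sciences, Beijing 100081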

Source: Journal of Chinese Information Processing (《中文信息学报》), 2014, No. 5, pp. 170-175, 191 (7 pages in total)

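Funding: National Natural Science Foundation of China (61201352; 61132009); National Basic Research Program of China (973 Program) (2013CB329303); Beijing Institute of Technology Basic Research Fund (20130742010)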

Abstract: Tibetan syntactic functional chunking aims to identify the syntactic constituents of Tibetan sentences in support of deeper sentence-level analysis. Building on a description system for Tibetan syntactic functional chunks and taking the characteristics of the language into account, the paper proposes an error-driven learning strategy for identifying chunk boundaries. Chunks are first recognized with a Conditional Random Fields (CRFs) model. The initial result is then corrected in a second pass, using either Transformation-based Error-driven Learning (TBL) or CRFs-based error-driven learning with a new feature template; these raise the F value by 1.65% and 8.36%, respectively. Finally, guided by an analysis of the experiments, the two error-driven mechanisms are combined. In experiments on a Tibetan corpus of 18,073 words, recognition performance improves further: precision, recall and F value reach 94.1%, 94.76% and 94.43%, respectively, which supports the effectiveness of the proposed method.
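The second-pass correction described in the abstract can be illustrated with a minimal sketch of transformation-based, error-driven rule learning. This is not the authors' implementation: the rule shape (previous part-of-speech tag, current boundary tag, replacement tag), the `B`/`I` boundary labels, the part-of-speech labels `n`, `case` and `v`, and the toy data are all assumptions made for illustration, and the first-pass tags stand in for the output of a CRFs model. The `f_measure` helper assumes the F value is the usual harmonic mean of precision and recall, which is consistent with the figures reported above.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (assumed definition of the F value)."""
    return 2 * precision * recall / (precision + recall)


def apply_rule(rule, pos_tags, tags):
    """Return a copy of `tags` with `rule` applied at every matching position."""
    prev_pos, from_tag, to_tag = rule
    out = list(tags)
    for i in range(1, len(tags)):
        if pos_tags[i - 1] == prev_pos and tags[i] == from_tag:
            out[i] = to_tag
    return out


def count_errors(tags, gold):
    """Number of positions where the tagging disagrees with the gold standard."""
    return sum(t != g for t, g in zip(tags, gold))


def learn_rules(pos_tags, initial, gold, max_rules=5):
    """Greedily learn correction rules from the errors of a first-pass tagging.

    A rule (prev_pos, from_tag, to_tag) is hypothetical: it retags a token
    from `from_tag` to `to_tag` when the previous token has POS `prev_pos`.
    """
    current = list(initial)
    learned = []
    for _ in range(max_rules):
        # Propose candidate rules only at positions that are currently wrong.
        candidates = {
            (pos_tags[i - 1], current[i], gold[i])
            for i in range(1, len(current))
            if current[i] != gold[i]
        }
        base = count_errors(current, gold)
        best_rule, best_gain = None, 0
        for rule in sorted(candidates):
            # Net gain: errors fixed minus errors introduced by the rule.
            gain = base - count_errors(apply_rule(rule, pos_tags, current), gold)
            if gain > best_gain:
                best_rule, best_gain = rule, gain
        if best_rule is None:  # no rule reduces the error count any further
            break
        learned.append(best_rule)
        current = apply_rule(best_rule, pos_tags, current)
    return learned, current


# Toy data (invented): POS tags, first-pass boundary tags, gold boundary tags.
pos_tags = ["n", "case", "n", "v", "n", "case", "v"]
first_pass = ["B", "I", "I", "B", "B", "I", "I"]
gold = ["B", "I", "B", "B", "B", "I", "B"]

rules, corrected = learn_rules(pos_tags, first_pass, gold)
print(rules)
print(corrected)
print(round(f_measure(0.941, 0.9476), 4))
```

On the toy data, both first-pass errors share the same context, so a single learned rule repairs them. A real system would learn its rules on held-out training data and draw on richer rule templates than the single one assumed here.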

Keywords: error-driven learning; Tibetan; syntactic functional chunks; chunk boundary identification; CRFs; TBL

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
