Hybrid model for overlapping ambiguities resolution (基于混合模型的交集型歧义消歧策略)

Cited by: 2

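Affiliations: [1] State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210093; [2] Department of Computer Science and Technology, Nanjing University, Nanjing 210093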

Source: Computer Engineering and Applications (《计算机工程与应用》), 2008, No. 21, pp. 5-8 (4 pages)

Funding: National Natural Science Foundation of China (Grant No. 60673043); National Social Science Foundation of China (No. 07BYY051); National High-Tech Research and Development Plan of China (863 Program) (Grant Nos. 2006AA01Z143 and 2006AA01Z139)

Abstract: Overlapping ambiguity is one of the key difficulties in Chinese word segmentation. This paper presents a hybrid disambiguation model that combines rule-based and statistical methods. First, a set of disambiguation rules is acquired automatically from an annotated corpus through error-driven learning, and an N-gram statistical language model is built with statistical tools. Next, overlapping-ambiguity strings are detected using forward and backward maximum matching (FMM and BMM) together with the rule set. Finally, the detected ambiguities are resolved using the rule set and a score function based on the N-gram language model. The hybrid method can detect more overlapping-ambiguity strings, and it combines the strengths of rule-based and statistical approaches in handling them. Experiments show that the method improves the precision of overlapping-ambiguity resolution and offers a new approach to the problem.
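The detection and scoring steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy lexicon, the counts, the function names, the add-one smoothing, and the choice of a bigram (N = 2) model are all assumptions made for illustration, and the error-driven rule set is omitted entirely. The sketch flags a string as ambiguous when FMM and BMM disagree, then keeps the candidate with the higher bigram log-probability.

```python
import math
from collections import Counter

def fmm(text, lexicon, max_len=4):
    """Forward maximum matching: take the longest lexicon word from the left."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:  # single characters always match
                words.append(piece)
                i += size
                break
    return words

def bmm(text, lexicon, max_len=4):
    """Backward maximum matching: take the longest lexicon word from the right."""
    words, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in lexicon:
                words.append(piece)
                j -= size
                break
    return words[::-1]

def bigram_score(words, unigrams, bigrams):
    """Add-one smoothed bigram log-probability (an assumed score function)."""
    vocab_size = len(unigrams)
    score, prev = 0.0, "<s>"
    for w in words:
        score += math.log((bigrams[(prev, w)] + 1) / (unigrams[prev] + vocab_size))
        prev = w
    return score

def resolve(text, lexicon, unigrams, bigrams):
    """Return the segmentation; score the candidates only if FMM and BMM differ."""
    forward, backward = fmm(text, lexicon), bmm(text, lexicon)
    if forward == backward:
        return forward  # no ambiguity detected by this simple test
    return max((forward, backward),
               key=lambda seg: bigram_score(seg, unigrams, bigrams))

# Hypothetical toy data, not taken from the paper.
lexicon = {"研究", "研究生", "生命"}
unigrams = Counter({"<s>": 10, "研究": 6, "生命": 5, "研究生": 3, "命": 1})
bigrams = Counter({("<s>", "研究"): 4, ("研究", "生命"): 3, ("<s>", "研究生"): 2})

print(resolve("研究生命", lexicon, unigrams, bigrams))  # ['研究', '生命']
```

On this toy input FMM yields 研究生/命 and BMM yields 研究/生命, so the string is flagged as an overlapping ambiguity and the bigram score selects the second reading. Comparing FMM with BMM misses ambiguities where both directions agree on the same wrong answer, which is presumably why the paper adds the rule set as a second detector.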

Keywords: overlapping ambiguity; disambiguation rules; statistical language model; score function; full segmentation

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
