A Combinational Ambiguity Resolution Algorithm for Chinese Word Segmentation (用于中文分词的组合型歧义消解算法)  Cited by: 5

COMBINATORIAL WORD SENSES DISAMBIGUATION ALGORITHM FOR CHINESE WORD SEGMENTATION


Authors: Yuan Dingrong (袁鼎荣)[1,2], Li Xinyou (李新友)[2], Shao Yanzhen (邵延振)[2]

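Affiliations: [1] International WIC Institute, Beijing University of Technology, Beijing 100022; [2] College of Computer Science and Information Engineering, Guangxi Normal University, Guilin, Guangxi 541004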

Source: Computer Applications and Software (《计算机应用与软件》), 2011, No. 6, pp. 57-58, 134 (3 pages)

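Funding: Cultivation project of the Major Research Plan of the National Natural Science Foundation of China (90718020); Australian ARC project (DP0667060)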

Abstract: Segmentation ambiguity is the bottleneck of automatic word segmentation. It falls into two types: crossing (overlapping) ambiguity and combinational ambiguity. Taking the sentence that contains a combinationally ambiguous field as the object of study, the paper examines, for each way of segmenting the field, how strongly the full text supports the words formed by pairing the resulting segments with their preceding and following context. From this it constructs a support-degree metric factor for the merged or the split segmentation and resolves the combinational ambiguity according to that factor. A worked example and an experiment show that the method is feasible and outperforms existing techniques.
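The abstract does not give the formula for the support-degree factor, so the following Python sketch only illustrates the general idea of comparing merge support against split support. Everything in it is an assumption made for illustration: the function names `occurrences` and `resolve_combinational`, the use of raw substring counts as "support", the choice to count only in the rest of the text (excluding the sentence under study), and the tie-break in favor of merging. It should not be read as the authors' algorithm.

```python
def occurrences(text: str, pattern: str) -> int:
    """Count occurrences of pattern in text, allowing overlaps."""
    if not pattern:
        return 0
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def resolve_combinational(other_text, prev, field, split_at, nxt):
    """Decide whether an ambiguous field should be merged or split.

    other_text: the document minus the sentence under study (assumption).
    prev, nxt:  the words immediately before and after the field.
    split_at:   index at which the field would be split.
    Returns (decision, merge_support, split_support).
    """
    left, right = field[:split_at], field[split_at:]

    # Merge evidence: the whole field appears next to the same neighbours.
    whole_prev = occurrences(other_text, prev + field)
    whole_next = occurrences(other_text, field + nxt)
    merge_support = whole_prev + whole_next

    # Split evidence: each half appears with its own neighbour,
    # excluding the cases already counted as the whole field.
    split_support = (
        (occurrences(other_text, prev + left) - whole_prev)
        + (occurrences(other_text, right + nxt) - whole_next)
    )

    # Hypothetical decision rule: split only when split support is larger.
    decision = "split" if split_support > merge_support else "merge"
    return decision, merge_support, split_support


# Sentence under study: 只有努力才能成功 (field 才能, neighbours 努力 / 成功).
rest = "他能成功是因为努力。她的才能出众。"
print(resolve_combinational(rest, "努力", "才能", 1, "成功"))
```

In this toy text the pairing 能成功 occurs elsewhere while 努力才能 and 才能成功 do not, so the sketch chooses the split reading 才 / 能. With a text that contains 的才能 and 才能出众 elsewhere, the same function would keep 才能 merged in 她的才能出众.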

Keywords: Chinese information processing; combinational ambiguity; co-occurrence support; ambiguity resolution; support-degree factor

Classification: TP391.12 [Automation and Computer Technology: Computer Application Technology]

 
