Research on Applying Modern Chinese Semantic Resources to the Disambiguation of Phrase Ambiguity Patterns  (Cited by: 9)

Research on the Application of a Chinese Semantic Knowledge Base in Chinese Phrase Disambiguation

Authors: WANG Jin (王锦) [1], CHEN Qunxiu (陈群秀) [1]

Affiliation: [1] Department of Computer Science and Technology, Tsinghua University, Beijing 100084

Source: Journal of Chinese Information Processing (《中文信息学报》), 2007, No. 5, pp. 80-86 (7 pages)

Funding: National 863 High-Tech Program of China (2001AA114210)

Abstract: Modern Chinese contains many ambiguous phrase structures, and part-of-speech tags alone are not enough to determine the correct collocation relations between the words of a sentence. Based on a study of a large number of ambiguous phrase instances, this paper analyzes the boundary ambiguity and structural-relation ambiguity that arise when Chinese structures are processed by computer. Building on existing phrase structure rules, it identifies seven structural ambiguity patterns and argues that the key to analyzing them is judging four basic types of collocation information. A disambiguation algorithm based on semantic knowledge and collocation knowledge is implemented. Experiments on 887 ambiguous phrases show that the accuracy of phrase structure processing rises from 82.30% to 87.18%.

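Keywords: computer application; Chinese information processing; modern Chinese semantic knowledge base; collocation dictionary; phrase disambiguation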

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
