Research on Identifying Chinese Comparative Sentences (汉语比较句识别研究). Cited by: 16

Learning to Identify Chinese Comparative Sentences

Source: Journal of Chinese Information Processing (《中文信息学报》), 2008, No. 5, pp. 30-38 (9 pages)

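Funding: National 863 Program of China (2008AA01Z421); National Natural Science Foundation of China (60703064); Ministry of Education Doctoral Program Fund for New Teachers in Higher Education (20070001059)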

Abstract: Comparison is a common form of expression, and extracting comparative relations between objects is a novel and practically valuable research problem. Identifying comparative sentences in natural language is an important step in extracting such relations. To the authors' knowledge, no prior work addresses the automatic identification of Chinese comparative sentences, and which linguistic features of comparative sentences can be carried over to automatic identification remains an open question. This paper discusses the category, scope, and characteristics of Chinese comparative sentences, defines the task of Chinese comparative sentence identification, and proposes using an SVM classifier to label each Chinese sentence as "comparative" or "non-comparative". It compares the contribution of linguistic and statistical features, including feature words (keywords) and sequential patterns, to classification. Experimental results show that an SVM classifier based on class sequential rules identifies Chinese comparative sentences effectively and significantly outperforms a traditional term-based text classifier. The paper also empirically investigates the main factors that affect classification performance.

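Keywords: computer applications; Chinese information processing; Chinese comparative sentence identification; comparative mining; text classification; sequential patterns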

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]
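
The abstract describes feeding keyword and sequential-pattern features to an SVM. The sketch below illustrates only the feature-extraction step under stated assumptions: the keyword list and the patterns are hypothetical, hand-written examples (the paper mines class sequential rules from labeled data, and its actual feature sets are not given in the abstract), and the input is assumed to be already word-segmented.

```python
from typing import List, Sequence, Tuple

# Hypothetical comparative keywords and ordered patterns, for illustration only.
# They are NOT taken from the paper.
KEYWORDS: List[str] = ["比", "更", "不如", "最"]
PATTERNS: List[Tuple[str, ...]] = [
    ("比", "更"),
    ("和", "一样"),
    ("没有", "那么"),
]


def is_subsequence(pattern: Sequence[str], tokens: Sequence[str]) -> bool:
    """Return True if the pattern items occur in tokens in order (gaps allowed)."""
    remaining = iter(tokens)
    # `item in remaining` advances the iterator, so order is enforced.
    return all(item in remaining for item in pattern)


def featurize(tokens: Sequence[str]) -> List[int]:
    """Map a segmented sentence to a binary vector: keyword hits, then pattern hits."""
    keyword_feats = [1 if k in tokens else 0 for k in KEYWORDS]
    pattern_feats = [1 if is_subsequence(p, tokens) else 0 for p in PATTERNS]
    return keyword_feats + pattern_feats


# "This phone is lighter than that one", pre-segmented by hand.
sentence = ["这", "台", "手机", "比", "那", "台", "更", "轻"]
print(featurize(sentence))
```

Vectors like these could then be passed, together with "comparative" / "non-comparative" labels, to any SVM implementation (for example scikit-learn's `sklearn.svm.SVC`); that training step is omitted here.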
