Study on Classification Features of Chinese Interrogatives (中文问句分类特征的研究)  Cited by: 8


Authors: Niu Yanqing (牛彦清)[1], Chen Junjie (陈俊杰)[1], Duan Liguo (段利国)[1], Zhang Wei (张巍)[1]

Affiliation: [1] College of Computer Science and Technology, Taiyuan University of Technology, Taiyuan 030024, Shanxi, China

Source: Computer Applications and Software (《计算机应用与软件》), 2012, No. 3, pp. 108-111 (4 pages)

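Funding: National Natural Science Foundation of China (60970059); Shanxi Province International Science and Technology Cooperation Program (2009081022)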

Abstract: Different question classification features affect classification differently, and the time complexity of extracting and processing them also differs. To address this, six classification features are extracted: the interrogative word of the question; the main sememes of the core keywords (the first- and second-level dependents of the interrogative word, and the head word of the question); the first sememes of the core keywords; the main sememes of the question's subject, predicate, and object; named entities; and the number (singular or plural) of nouns. Using a support vector machine classifier, comparative classification experiments with different feature combinations are carried out on factoid questions. The experiments show that main sememes extracted with word sense disambiguation not only have a marked effect on classification accuracy but also substantially reduce the dimensionality of the feature vector and shorten processing time.
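The dimensionality claim in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy lexicon `TOY_MAIN_SEMEME`, the pre-parsed `QUESTIONS`, the English example words, and the helper names are hypothetical and are not taken from the paper. It only shows how feature groups can be switched on and off to form different combinations, and why mapping keywords to a shared main sememe yields fewer feature dimensions than using the keywords themselves. The resulting binary vectors are the kind of input an SVM classifier would receive; training is left out.

```python
# Hypothetical toy lexicon: keyword -> main sememe (illustrative only).
TOY_MAIN_SEMEME = {
    "city": "place",
    "country": "place",
    "river": "place",
    "president": "human",
    "author": "human",
    "year": "time",
}

# Hypothetical pre-parsed questions: interrogative word plus core keywords.
QUESTIONS = [
    {"wh": "which", "keywords": ["city"]},
    {"wh": "which", "keywords": ["country", "river"]},
    {"wh": "who", "keywords": ["president"]},
    {"wh": "who", "keywords": ["author"]},
    {"wh": "which", "keywords": ["year"]},
]


def extract(question, groups):
    """Return the set of string features for the selected feature groups."""
    feats = set()
    if "wh" in groups:
        feats.add("wh=" + question["wh"])
    if "word" in groups:
        feats.update("word=" + w for w in question["keywords"])
    if "sememe" in groups:
        feats.update("sem=" + TOY_MAIN_SEMEME[w] for w in question["keywords"])
    return feats


def build_index(questions, groups):
    """Assign one vector dimension to every distinct feature seen in the data."""
    vocab = sorted(set().union(*(extract(q, groups) for q in questions)))
    return {feat: i for i, feat in enumerate(vocab)}


def vectorize(question, groups, index):
    """Encode one question as a binary vector over the indexed features."""
    vec = [0] * len(index)
    for feat in extract(question, groups):
        vec[index[feat]] = 1
    return vec


word_index = build_index(QUESTIONS, {"wh", "word"})
sememe_index = build_index(QUESTIONS, {"wh", "sememe"})
```

On this toy data, `word_index` has 8 dimensions and `sememe_index` has 5, because six distinct keywords collapse into three sememes. The reduction reported in the paper comes from its own experiments and is not reproduced here.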

Keywords: question classification; main sememe; word sense disambiguation; support vector machine

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
