Short text classification based on synonymy expansion (基于同义词词林扩展的短文本分类)  Cited by: 9


Authors: WANG Dong (王东)[1,2], XIONG Shihuan (熊世桓)[1,2]

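Affiliations: [1] School of Mathematics and Computer Science, Guizhou Education University (贵州师范学院), Guiyang, Guizhou 550018, China; [2] Guizhou Provincial University Engineering Technology Research Center for Industrial Internet of Things (贵州省高校工业物联网工程技术研究中心), Guiyang, Guizhou 550018, China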

Source: Journal of Lanzhou University of Technology (《兰州理工大学学报》), 2015, No. 4, pp. 104-108 (5 pages)

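Funding: Governor's Special Fund for Outstanding Science, Technology and Education Talents of Guizhou Province (黔省专合字(2012)82); Guiyang Science and Technology Plan Project (筑科合同[2013101]10-6号)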

Abstract: To address the weak representational power caused by the feature sparsity of short texts, a short text classification method based on expansion with Tongyici Cilin (同义词词林, a Chinese synonym thesaurus) is proposed. The method first uses Tongyici Cilin to determine the synonym relations of the main words in a short text. It then introduces a large-scale word collocation resource to perform unsupervised sense discrimination for polysemous words, which determines the candidate expansion features. Finally, it computes the semantic relatedness between each candidate expansion feature and the given context, and adds the candidates that satisfy the condition to the feature vector. Experimental results show that the method takes a fairly comprehensive set of factors into account and effectively improves short text classification performance.
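The three steps in the abstract (synonym lookup, collocation-based sense discrimination, relatedness filtering) can be illustrated with a minimal Python sketch. The abstract does not give the paper's scoring formulas, data formats, or thresholds, so everything here is an assumption made for illustration: the toy `THESAURUS` and `COLLOCATIONS` dictionaries stand in for Tongyici Cilin and the collocation resource, English tokens replace Chinese words, and `pick_sense`, `relevance`, and `threshold` are hypothetical simple overlap measures, not the authors' definitions.

```python
from collections import Counter

# Hypothetical toy thesaurus: word -> list of senses, each a set of synonyms.
# Stands in for Tongyici Cilin; the entries are invented for illustration.
THESAURUS = {
    "apple": [{"fruit", "pome"}, {"iphone_maker", "tech_firm"}],
    "cheap": [{"inexpensive", "low_cost"}],
}

# Hypothetical collocation resource: word -> words it commonly co-occurs with.
COLLOCATIONS = {
    "fruit": {"eat", "fresh", "juice"},
    "pome": {"eat", "tree"},
    "iphone_maker": {"phone", "launch", "store"},
    "tech_firm": {"phone", "stock", "launch"},
    "inexpensive": {"phone", "price", "buy"},
    "low_cost": {"price", "buy"},
}


def pick_sense(senses, context):
    """Pick the sense whose synonyms' collocates overlap the context most.

    Returns an empty set when no sense has any overlap (assumed rule).
    """
    def score(sense):
        return sum(len(COLLOCATIONS.get(s, set()) & context) for s in sense)

    best = max(senses, key=score)
    return best if score(best) > 0 else set()


def relevance(candidate, context):
    """Fraction of context words found among the candidate's collocates."""
    if not context:
        return 0.0
    return len(COLLOCATIONS.get(candidate, set()) & context) / len(context)


def expand(tokens, threshold=0.3):
    """Return a bag-of-words feature vector extended with accepted synonyms."""
    features = Counter(tokens)
    for word in tokens:
        senses = THESAURUS.get(word)
        if not senses:
            continue  # word has no thesaurus entry, nothing to expand
        context = set(tokens) - {word}
        for cand in sorted(pick_sense(senses, context)):
            # Keep only candidates sufficiently related to the context.
            if cand not in features and relevance(cand, context) >= threshold:
                features[cand] += 1
    return features


if __name__ == "__main__":
    print(expand(["apple", "launch", "cheap", "phone"]))
```

In this toy run, "apple" is resolved to its company sense because the context contains "launch" and "phone", and "low_cost" is rejected because none of its collocates appear in the context. A real system would replace the overlap counts with whatever relatedness measure the paper defines.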

Keywords: short text classification; feature expansion; Tongyici Cilin (同义词词林); collocation lexicon

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
