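Affiliations: [1] School of Mathematics and Computer Science, Guizhou Normal College (贵州师范学院), Guiyang, Guizhou 550018 [2] Guizhou Provincial University Engineering Technology Research Center for Industrial Internet of Things (贵州省高校工业物联网工程技术研究中心), Guiyang, Guizhou 550018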
Source: Journal of Lanzhou University of Technology (《兰州理工大学学报》), 2015, No. 4, pp. 104-108 (5 pages)
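Funding: Governor's Special Fund for Outstanding Science, Technology and Education Talents of Guizhou Province (黔省专合字(2012)82); Guiyang Science and Technology Plan Project (筑科合同[2013101]10-6号)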
Abstract: To address the weak representational power caused by sparse features in short texts, a short text classification method based on expansion with the synonym thesaurus Tongyici Cilin (同义词词林) is proposed. The method first uses the thesaurus to determine the synonym relations of the main words in a short text, then introduces a large-scale word collocation resource to perform unsupervised sense discrimination for polysemous words, which yields the candidate expansion features. Finally, it computes the semantic relevance between each candidate expansion feature and the given context, and adds the candidates that satisfy the condition to the feature vector. Experimental results show that the method takes a fairly comprehensive set of factors into account and effectively improves short text classification performance.
Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]
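The three steps named in the abstract (synonym lookup, collocation-based sense selection, relevance filtering) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the tiny `THESAURUS` and `COLLOCATIONS` tables stand in for Tongyici Cilin and the large-scale collocation resource, the co-occurrence-based scoring and the `threshold` value are placeholders rather than the paper's actual formulas, and every word found in the thesaurus is treated as a main word.

```python
from collections import Counter

# Hypothetical toy thesaurus: word -> list of senses, each sense a set of synonyms.
THESAURUS = {
    "apple": [{"fruit", "pome"}, {"iphone_maker", "tech_company"}],
}

# Hypothetical collocation counts standing in for a large collocation resource.
COLLOCATIONS = {
    ("fruit", "juice"): 8,
    ("fruit", "fresh"): 5,
    ("iphone_maker", "phone"): 7,
    ("tech_company", "phone"): 9,
}


def cooccurrence(a, b):
    """Symmetric lookup of how often two words collocate."""
    return COLLOCATIONS.get((a, b), 0) + COLLOCATIONS.get((b, a), 0)


def pick_sense(word, context):
    """Unsupervised sense choice: keep the sense whose synonyms
    collocate most strongly with the surrounding words."""
    senses = THESAURUS.get(word, [])
    if not senses:
        return set()

    def score(sense):
        return sum(cooccurrence(s, c) for s in sense for c in context)

    best = max(senses, key=score)
    return best if score(best) > 0 else set()


def relevance(candidate, context):
    """Assumed relevance measure: average collocation count with the context."""
    if not context:
        return 0.0
    return sum(cooccurrence(candidate, c) for c in context) / len(context)


def expand(tokens, threshold=1.0):
    """Return a bag-of-words feature vector extended with relevant synonyms."""
    features = Counter(tokens)
    for word in tokens:
        context = [t for t in tokens if t != word]
        for cand in sorted(pick_sense(word, context)):
            if cand not in features and relevance(cand, context) >= threshold:
                features[cand] = 1
    return features


if __name__ == "__main__":
    print(sorted(expand(["apple", "juice", "fresh"])))
    # → ['apple', 'fresh', 'fruit', 'juice']
```

In this toy run the context words select the fruit sense of "apple", and "pome" is rejected because it never collocates with the context. A real system would replace the averaged count with whatever semantic relevance measure the paper defines.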