A New Chinese Text Classification Algorithm: the One Class SVM-KNN Algorithm  Cited by: 4

A New Text Classification Algorithm——One Class SVM-KNN

Authors: Liu Wen (刘文)[1], Wu Chen (吴陈)[1]

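Affiliation: [1] Intelligent Information Processing Laboratory, Jiangsu University of Science and Technology, Zhenjiang, Jiangsu 212003, China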

Source: Computer Technology and Development (《计算机技术与发展》), 2012, No. 5, pp. 83-86 (4 pages)

Abstract: Chinese text classification is widely used in databases and search engines. The K-nearest neighbor (KNN) algorithm is a common method for Chinese text classification, but it has several shortcomings: it must store all training samples, it does not build a classifier until a test sample needs to be classified, it suffers from class skew, and its storage and computation costs are high. One-class SVM performs well on problems that involve only a single class, but it is not suited to multi-class classification. To address the shortcomings of KNN while drawing on the characteristics of one-class SVM, the One Class SVM-KNN algorithm is proposed, and its definition and a detailed analysis are given. Experiments show that the method overcomes the shortcomings of KNN well and that its recall and precision are clearly better than those of the KNN algorithm.
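The class-skew and lazy-evaluation issues that the abstract attributes to KNN can be illustrated with a minimal sketch. This is not the paper's One Class SVM-KNN algorithm, which the abstract does not specify; it is plain majority-vote KNN on made-up 2-D points, and the names `knn_predict`, `train`, and `query` are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k):
    """Classify `query` by majority vote among its k nearest training samples.

    `train` is a list of (vector, label) pairs. Nothing is learned ahead of
    time: every stored sample is scanned at query time, which is the storage
    and computation cost the abstract refers to.
    """
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Toy, made-up "document vectors" (assumption, not data from the paper):
# 8 samples of a majority class and only 2 samples of a minority class.
train = [((x / 10, y / 10), "sports") for x in range(4) for y in range(2)]
train += [((1.0, 1.0), "finance"), ((1.1, 1.0), "finance")]

query = (1.0, 0.9)  # lies right next to the two "finance" samples

small_k = knn_predict(train, query, k=1)  # nearest neighbour only
large_k = knn_predict(train, query, k=7)  # majority class outvotes the 2 minority samples
print(small_k, large_k)
```

With `k=7` the query is assigned to the majority class even though its two closest neighbours belong to the minority class, because the minority class has too few samples to win the vote. This is one common reading of "class skew"; the paper's own definition may differ.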

Keywords: Chinese text classification; support vector machine; K-nearest neighbor; One Class SVM-KNN

Classification code: TP301.6 [Automation and Computer Technology: Computer System Architecture]

 
