Application of multiclass SVM based on binary tree in E-mail filtering

Cited by: 4


Authors: Yi Zhian (衣治安)[1], Liu Yang (刘杨)[1]

Affiliation: [1] School of Computer and Information Technology, Daqing Petroleum Institute, Daqing, Heilongjiang 163318, China

Source: Journal of Computer Applications (《计算机应用》), 2007, No. 11, pp. 2860-2862 (3 pages)

Funding: Heilongjiang Province Graduate Innovation Research Fund Project (YJSCX2006-38HLJ)

Abstract: The better-performing multiclass algorithms currently available, such as the 1-v-r (one-versus-rest) support vector machine (SVM), 1-v-1 (one-versus-one) SVM, and DDAG SVM, leave many unclassifiable regions and require long training times. This paper proposes a multiclass SVM algorithm based on a binary tree for classifying and filtering E-mail. Constructing a binary tree converts the multiclass problem into a sequence of binary classifications. The algorithm clusters first and classifies second: it computes the maximum similarity between test samples and sub-class centers, together with the degree of separation between sub-classes, in order to construct the optimal separating hyperplane at each decision node. A C-class problem needs only C - 1 decision functions, which saves training time. Experimental results show that the algorithm achieves high filtering recall and precision.
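To make the C - 1 claim concrete, the following is a minimal sketch of a binary-tree decomposition, not the authors' implementation. The abstract does not give the clustering rule, the similarity measure, or the kernel, so everything specific here is an assumption: class groups are formed around the two most distant class centroids, the per-node SVM is replaced by a simple stand-in (the hyperplane bisecting the two group centroids), and the function names, E-mail categories, and two-dimensional feature vectors are all hypothetical.

```python
from itertools import combinations
from math import dist


def centroid(points):
    """Mean vector of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def split_classes(data):
    """Assumed clustering step: seed two groups with the two most distant
    class centroids, then attach every other class to the nearer seed."""
    centers = {label: centroid(pts) for label, pts in data.items()}
    a, b = max(combinations(sorted(centers), 2),
               key=lambda pair: dist(centers[pair[0]], centers[pair[1]]))
    left, right = [a], [b]
    for c in centers:
        if c not in (a, b):
            nearer_a = dist(centers[c], centers[a]) <= dist(centers[c], centers[b])
            (left if nearer_a else right).append(c)
    return left, right


def fit_node_classifier(pos_points, neg_points):
    """Stand-in for SVM training (assumption): the hyperplane that bisects
    the segment between the two group centroids. A real SVM trainer would
    be substituted here."""
    cp, cn = centroid(pos_points), centroid(neg_points)
    w = tuple(p - n for p, n in zip(cp, cn))
    b = -sum(wi * (p + n) / 2 for wi, p, n in zip(w, cp, cn))
    return lambda x: sum(wi * xi for wi, xi in zip(w, x)) + b


def build_tree(data):
    """Return a class label (leaf) or a (decision_fn, left, right) node."""
    if len(data) == 1:
        return next(iter(data))
    left, right = split_classes(data)
    decide = fit_node_classifier([p for c in left for p in data[c]],
                                 [p for c in right for p in data[c]])
    return (decide,
            build_tree({c: data[c] for c in left}),
            build_tree({c: data[c] for c in right}))


def predict(tree, x):
    """Walk from the root to a leaf, one binary decision per level."""
    while isinstance(tree, tuple):
        decide, left, right = tree
        tree = left if decide(x) >= 0 else right
    return tree


def count_decision_nodes(tree):
    if not isinstance(tree, tuple):
        return 0
    return 1 + count_decision_nodes(tree[1]) + count_decision_nodes(tree[2])


# Hypothetical 2-D feature vectors for four E-mail categories.
emails = {
    "work":     [(0.0, 0.0), (0.0, 2.0)],
    "personal": [(4.0, 0.0), (4.0, 2.0)],
    "ads":      [(20.0, 0.0), (20.0, 2.0)],
    "phishing": [(26.0, 0.0), (26.0, 2.0)],
}
tree = build_tree(emails)
print(count_decision_nodes(tree))   # 3, i.e. C - 1 for C = 4
print(predict(tree, (1.0, 1.0)))    # work
print(predict(tree, (25.0, 1.0)))   # phishing
```

Because every internal node has exactly two children and every leaf holds one class, a tree over C classes always contains C - 1 decision functions, compared with C(C - 1)/2 for a 1-v-1 scheme. Each test sample also ends at exactly one leaf, so no region is left unclassified.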

Keywords: binary tree; multiclass SVM; E-mail filtering; clustering

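Classification codes: TP393.08 [Automation and Computer Technology: Computer Application Technology]; TP181 [Automation and Computer Technology: Computer Science and Technology]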

 
