An Improved Stacking Algorithm Based on Cluster Analysis (基于聚类分析的改进堆叠算法)  Cited by: 1


Authors: Hu Xiaosheng (胡小生)[1], Zhang Runjing (张润晶)[2], Zhong Yong (钟勇)[1]

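Affiliations: [1] School of Electronic and Information Engineering, Foshan University (佛山科学技术学院), Foshan 528000; [2] Information and Educational Technology Center, Foshan University (佛山科学技术学院), Foshan 528000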

Source: Computer & Digital Engineering (《计算机与数字工程》), 2013, No. 11, pp. 1725-1728 (4 pages)

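Funding: Supported by the Foshan Science and Technology Development Special Fund Project (No. 2011AA100061), the Foshan Industry-University-Research Special Fund Project (No. 2012HC100272), and the Foshan Intelligent Education Evaluation Index System Research Project (No. DX20120220)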

Abstract: Building on meta-learning for heterogeneous classifier ensembles within the Stacking framework, this paper applies unsupervised K-means clustering to the classification process and proposes an improved Stacking ensemble algorithm based on cluster analysis. Training instances are first classified by all base classifiers, and the classification results are then grouped into a number of clusters, so that each cluster contains instances that were correctly or incorrectly assigned to the same class by the same group of base classifiers. The full set of instance features is also used in the clustering process to improve clustering quality. Next, the C4.5 algorithm builds a decision tree on each cluster, which refines the decision boundaries by learning the subgroups within that cluster. To classify a new instance, the approach finds the cluster closest to it and then uses that cluster's decision tree model to make the final decision. Experimental results show that the proposed method has a clear advantage in classification accuracy, outperforming individual classifiers, majority voting, and the classic Stacking method.
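
The procedure described in the abstract can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation. The two base learners (a nearest-centroid rule and a single-feature threshold), the one-level decision stump used in place of C4.5, the deterministic farthest-point K-means initialisation, the encoding of base predictions as class indices, and all function names are hypothetical choices made here; the abstract does not specify them. For brevity the base classifiers also label the same data they were trained on, whereas a Stacking setup would usually obtain these predictions by cross-validation.

```python
from collections import Counter

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def fit_centroid(X, y):
    """Illustrative base learner: predict the class with the nearest mean vector."""
    cents = {}
    for c in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == c]
        cents[c] = [sum(col) / len(rows) for col in zip(*rows)]
    return lambda x: min(cents, key=lambda c: sq_dist(x, cents[c]))

def fit_stump(X, y, features=None):
    """One-level decision tree; a simple stand-in for C4.5 (assumption)."""
    default = majority(y)
    best_correct, rule = sum(t == default for t in y), None
    for f in (features if features is not None else range(len(X[0]))):
        for thr in sorted({x[f] for x in X}):
            left = [t for x, t in zip(X, y) if x[f] <= thr]
            right = [t for x, t in zip(X, y) if x[f] > thr]
            if not left or not right:
                continue
            l, r = majority(left), majority(right)
            correct = sum(t == l for t in left) + sum(t == r for t in right)
            if correct > best_correct:
                best_correct, rule = correct, (f, thr, l, r)
    if rule is None:                      # no split beats the majority class
        return lambda x: default
    f, thr, l, r = rule
    return lambda x: l if x[f] <= thr else r

def kmeans(points, k, iters=20):
    """Plain K-means with deterministic farthest-point initialisation (assumption)."""
    cents = [list(points[0])]
    while len(cents) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in cents))
        cents.append(list(far))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[min(range(k), key=lambda j: sq_dist(p, cents[j]))].append(p)
        # keep the old centroid if a cluster happens to be empty
        cents = [[sum(col) / len(g) for col in zip(*g)] if g else c
                 for g, c in zip(groups, cents)]
    return cents

def fit_cluster_stacking(X, y, k=2):
    classes = sorted(set(y))
    # Step 1: train the base classifiers.
    bases = [fit_centroid(X, y), fit_stump(X, y, features=[0])]

    def augment(x):
        # Original features plus each base classifier's predicted class index.
        return list(x) + [float(classes.index(b(x))) for b in bases]

    # Step 2: cluster base predictions together with the instance features.
    Z = [augment(x) for x in X]
    cents = kmeans(Z, k)

    def closest(z):
        return min(range(k), key=lambda j: sq_dist(z, cents[j]))

    # Step 3: fit one tree-like model per cluster.
    members = [[] for _ in range(k)]
    for z, t in zip(Z, y):
        members[closest(z)].append((z, t))
    fallback = majority(y)
    models = [fit_stump([z for z, _ in m], [t for _, t in m]) if m
              else (lambda z: fallback) for m in members]

    # Step 4: route a new instance to its nearest cluster's model.
    def predict(x):
        z = augment(x)
        return models[closest(z)](z)
    return predict

X_demo = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2],
          [2.0, 2.1], [2.2, 1.9], [1.9, 2.3], [2.3, 2.2]]
y_demo = ["a"] * 4 + ["b"] * 4
predict_demo = fit_cluster_stacking(X_demo, y_demo, k=2)
print([predict_demo(x) for x in ([0.1, 0.1], [2.1, 2.0])])
```

The toy data at the end only checks that the pipeline runs end to end; it says nothing about the accuracy gains reported in the paper.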

Keywords: classifier ensemble; Stacking; clustering; meta-learning

Classification code: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]

 
