Sampling algorithm using kernel-based fuzzy clustering for active learning (面向主动学习的模糊核聚类采样算法)  Cited by: 1


Authors: Wang Yongzhen (王勇臻), Chen Yan (陈燕)[1], Zhang Jinsong (张金松)[1]

Affiliation: [1] Transportation Management College, Dalian Maritime University, Dalian, Liaoning 116026, China

Source: Application Research of Computers (《计算机应用研究》), 2017, No. 12, pp. 3564-3568 (5 pages)

Funding: National Natural Science Foundation of China (71271034); Natural Science Foundation of Liaoning Province (2014025015); Young Backbone Teachers Fund (3132016045)

Abstract: Selecting representative samples is difficult when constructing the initial classifier in active learning. To address this problem, the paper proposes a sampling algorithm based on kernel-based fuzzy clustering. The algorithm first partitions the sample set by cluster analysis, then selects samples for labeling from the regions near the cluster centers and near the cluster boundaries, and finally constructs the initial classifier from the labeled samples. A Gaussian kernel function nonlinearly maps the points of the original sample space into a high-dimensional feature space so that the samples become linearly separable for clustering, and an initial cluster center selection method based on local density is introduced to improve the clustering result. To improve sampling quality, a sampling proportion allocation strategy is designed from the number of samples in each resulting cluster; in addition, a fallback sampling strategy applied at the end of sampling ensures that the required number of samples is reached. Experimental results show that the proposed algorithm effectively reduces the manual labeling burden of constructing the initial classifier while achieving relatively high classification accuracy.
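The abstract gives no formulas, so the following is only a minimal sketch of the sampling stage under stated assumptions, not the authors' implementation. It assumes the cluster centers are already known (the local-density-based initialization and the iterative clustering itself are omitted), uses the standard fuzzy c-means membership formula with the Gaussian-kernel-induced distance 2 - 2K(x, v), allocates each cluster's quota as the floor of its proportional share, splits that quota between the highest-membership (center) points and the most ambiguous (boundary) points, and tops up any shortfall with the most ambiguous remaining points. The function names and the parameters `sigma` and `m` are hypothetical.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    """K(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def fuzzy_memberships(x, centers, m=2.0, sigma=1.0):
    """Fuzzy c-means memberships of x, using the kernel-induced distance."""
    # Squared feature-space distance: K(x,x) - 2K(x,v) + K(v,v) = 2 - 2K(x,v).
    dists = [2.0 * (1.0 - gaussian_kernel(x, v, sigma)) for v in centers]
    if min(dists) < 1e-12:  # x coincides with at least one center
        hits = [1.0 if d < 1e-12 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    weights = [d ** (-1.0 / (m - 1.0)) for d in dists]
    total = sum(weights)
    return [w / total for w in weights]

def allocate_quota(cluster_sizes, n_samples):
    """Assumed rule: floor of each cluster's proportional share of n_samples."""
    total = sum(cluster_sizes)
    return [min(size, n_samples * size // total) for size in cluster_sizes]

def sample_for_labeling(points, centers, n_samples, m=2.0, sigma=1.0):
    """Return indices of points to label; requires at least two centers."""
    u = [fuzzy_memberships(p, centers, m, sigma) for p in points]
    n_clusters = len(centers)
    labels = [max(range(n_clusters), key=lambda k: row[k]) for row in u]
    clusters = [[i for i, lab in enumerate(labels) if lab == k]
                for k in range(n_clusters)]
    quotas = allocate_quota([len(c) for c in clusters], n_samples)

    def ambiguity(i):
        # Gap between the two largest memberships; a small gap means boundary.
        top = sorted(u[i], reverse=True)
        return top[0] - top[1]

    chosen = []
    for k, members in enumerate(clusters):
        n_center = (quotas[k] + 1) // 2        # center region share
        n_boundary = quotas[k] - n_center      # boundary region share
        core = sorted(members, key=lambda i: -u[i][k])[:n_center]
        rest = [i for i in members if i not in core]
        border = sorted(rest, key=ambiguity)[:n_boundary]
        chosen.extend(core + border)

    # Fallback top-up (assumed rule): rounding down can leave the total short,
    # so fill the gap with the most ambiguous points not yet chosen.
    if len(chosen) < n_samples:
        remaining = [i for i in range(len(points)) if i not in chosen]
        remaining.sort(key=ambiguity)
        chosen.extend(remaining[:n_samples - len(chosen)])
    return chosen

# Toy data: two well-separated groups in the plane (illustrative only).
points = [(0.0, 0.0), (0.2, 0.1), (1.0, 1.0),
          (5.0, 5.0), (5.1, 4.9), (4.0, 4.0), (4.8, 5.2)]
centers = [(0.0, 0.0), (5.0, 5.0)]
print(sample_for_labeling(points, centers, n_samples=4))
```

On this toy data the proportional quotas sum to 3 rather than the requested 4, so the top-up step supplies the last index; this is the situation a fallback sampling strategy is meant to handle.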

Keywords: Gaussian kernel function; cluster analysis; sampling; active learning; classification

Classification code: TP301.6 [Automation and Computer Technology: Computer System Architecture]

 
