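Affiliation: [1] School of Transportation Management, Dalian Maritime University, Dalian, Liaoning 116026
Source: Application Research of Computers (《计算机应用研究》), 2017, No. 12, pp. 3564-3568 (5 pages)
Funding: National Natural Science Foundation of China (71271034); Natural Science Foundation of Liaoning Province (2014025015); Young Backbone Teachers Fund (3132016045)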
Abstract: In active learning, it is difficult to select representative samples when constructing the initial classifier. To address this problem, the paper proposes a sampling algorithm based on kernel-based fuzzy clustering. The algorithm first partitions the sample set by clustering analysis, then selects samples for labeling from the regions near the cluster centers and near the cluster boundaries, and finally constructs the initial classifier from these labeled samples. A Gaussian kernel function maps points from the original sample space nonlinearly into a high-dimensional feature space so that the data become linearly clusterable, and an initial cluster center selection method based on local density is introduced to improve clustering quality. To improve sampling quality, a sampling proportion allocation strategy is designed that takes into account the number of samples in each cluster after partitioning. In addition, a fallback sampling strategy applied at the end of sampling ensures that the required number of samples is reached. Experimental results indicate that the proposed algorithm effectively reduces the manual labeling burden of constructing the initial classifier while achieving high classification accuracy.
Classification code: TP301.6 [Automation and Computer Technology: Computer System Architecture]
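The abstract names the building blocks of the method but gives no formulas, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes the standard kernel fuzzy c-means membership rule with the Gaussian-kernel identity for the squared feature-space distance, 2(1 - K(x, v)), a global (not per-cluster) choice of center and boundary samples based on the gap between the two largest memberships, and a simple proportional quota with a round-robin top-up standing in for the fallback sampling strategy. All function names and the parameters `sigma`, `m`, and `budget` are hypothetical.

```python
import math


def gaussian_kernel(x, v, sigma=1.0):
    """Gaussian (RBF) kernel between two points given as equal-length tuples."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def memberships(x, centers, sigma=1.0, m=2.0):
    """Fuzzy memberships of x to each center (assumed KFCM-style rule)."""
    # Squared distance in feature space for a Gaussian kernel: 2 * (1 - K(x, v)).
    d2 = [2.0 * (1.0 - gaussian_kernel(x, v, sigma)) for v in centers]
    # A point that coincides with a center belongs to it entirely.
    for i, d in enumerate(d2):
        if d == 0.0:
            return [1.0 if j == i else 0.0 for j in range(len(centers))]
    inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
    total = sum(inv)
    return [w / total for w in inv]


def pick_center_and_boundary(points, centers, n_center, n_boundary,
                             sigma=1.0, m=2.0):
    """Return (center_indices, boundary_indices) chosen by membership margin.

    Assumption: a small gap between the two largest memberships marks a
    boundary sample; a large gap marks a sample near a cluster center.
    """
    scored = []
    for idx, x in enumerate(points):
        u = sorted(memberships(x, centers, sigma, m), reverse=True)
        margin = u[0] - (u[1] if len(u) > 1 else 0.0)
        scored.append((margin, idx))
    scored.sort()
    boundary = [idx for _, idx in scored[:n_boundary]]
    center = [idx for _, idx in reversed(scored) if idx not in boundary]
    return center[:n_center], boundary


def allocate_quota(cluster_sizes, budget):
    """Split a labeling budget across clusters in proportion to their size."""
    total = sum(cluster_sizes)
    quotas = [min(size, budget * size // total) for size in cluster_sizes]
    # Fallback top-up (assumed): hand out the remainder one sample at a time,
    # largest clusters first, never exceeding a cluster's size.
    remaining = budget - sum(quotas)
    order = sorted(range(len(cluster_sizes)),
                   key=lambda i: cluster_sizes[i], reverse=True)
    while remaining > 0:
        progressed = False
        for i in order:
            if remaining == 0:
                break
            if quotas[i] < cluster_sizes[i]:
                quotas[i] += 1
                remaining -= 1
                progressed = True
        if not progressed:
            break  # budget exceeds the number of available samples
    return quotas
```

For example, `allocate_quota([5, 3, 2], 7)` first assigns the floor of each proportional share and then tops up the largest cluster so that the quotas sum to the budget of 7. The density-based initialization of cluster centers mentioned in the abstract is not sketched here because the abstract does not describe it in enough detail.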