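Authors: Pu Luping (浦路平)[1], Zhao Pengda (赵鹏大)[1], Hu Guangdao (胡光道)[1], Zhang Zhenfei (张振飞)[1], Xia Qinglin (夏庆霖)[1]
Affiliation: [1] Institute of Remote Sensing Geology and Mathematical Geology, China University of Geosciences (中国地质大学遥感地质与数学地质所)
Source: Application Research of Computers (《计算机应用研究》), 2008, No. 5, pp. 1412-1414 (3 pages)
Funding: National Natural Science Foundation of China (402721122); Guangxi Department of Education project (桂教科研[2004]4号)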
Abstract: This paper proposes PCASHC (PCA split supervised hierarchy clustering), a new supervised, binary-split hierarchical clustering method based on PCA and K-means clustering. Clusters are split in two, one after another, by K-means. The initial centers of each split are the two samples that lie farthest apart along the cluster's first principal component, that is, the samples with the maximum and minimum first-component scores. This removes the nondeterministic results that K-means produces when its initial centers are chosen at random. The variance of the class labels of the samples in a cluster is used as the impurity measure of that cluster and controls how far splitting proceeds, which avoids over-fitting and lets the method learn a suitable number of clusters. The method was evaluated by classification error under 10-fold cross-validation on four standard UCI datasets. Compared with seven other classifiers tested on the same datasets in the same way, PCASHC achieves relatively high classification accuracy.
Classification No.: TP301 [Automation and Computer Technology: Computer System Architecture]
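The abstract gives only an outline of the procedure, so the following is a minimal Python sketch of the idea rather than the authors' implementation. It seeds a two-cluster K-means with the samples at the minimum and maximum of the first principal component, and keeps splitting a cluster while its class impurity exceeds a threshold. Several details are assumptions not stated in the record: the impurity is taken as the summed variance of one-hot class indicators (which equals the Gini impurity), the `max_impurity` and `min_size` parameters are hypothetical, and classification by nearest leaf centroid with majority-class labels is an illustrative choice. All function names are invented for this sketch.

```python
import numpy as np

def pca_seeded_two_means(X, n_iter=100):
    """Split X into two clusters with 2-means, seeded deterministically
    from the extremes of the first principal component."""
    centered = X - X.mean(axis=0)
    # The first right singular vector of the centered data is the first principal axis.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[0]
    centers = np.stack([X[np.argmin(scores)], X[np.argmax(scores)]])
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        if labels.min() == labels.max():
            break  # one side is empty, so no valid split exists
        new_centers = np.stack([X[labels == k].mean(axis=0) for k in (0, 1)])
        if np.allclose(new_centers, centers):
            break  # converged
        centers = new_centers
    return labels

def class_impurity(y, n_classes):
    """Assumed impurity: summed variance of one-hot class indicators,
    i.e. sum_c p_c * (1 - p_c)."""
    p = np.bincount(y, minlength=n_classes) / len(y)
    return float(np.sum(p * (1.0 - p)))

def split_hierarchy(X, y, n_classes, max_impurity=0.1, min_size=2):
    """Recursively split impure clusters; return (centroid, majority_class) leaves.
    max_impurity and min_size are hypothetical stopping parameters."""
    stack = [np.arange(len(X))]
    leaves = []
    while stack:
        idx = stack.pop()
        impure = class_impurity(y[idx], n_classes) > max_impurity
        if len(idx) >= min_size and impure:
            labels = pca_seeded_two_means(X[idx])
            left, right = idx[labels == 0], idx[labels == 1]
            if len(left) and len(right):
                stack.extend([left, right])
                continue
        majority = int(np.bincount(y[idx], minlength=n_classes).argmax())
        leaves.append((X[idx].mean(axis=0), majority))
    return leaves

def predict(leaves, X):
    """Assumed decision rule: label of the nearest leaf centroid."""
    centroids = np.stack([c for c, _ in leaves])
    classes = np.array([k for _, k in leaves])
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]

# Usage on synthetic data (not one of the UCI datasets from the paper).
rng = np.random.default_rng(0)
X_demo = np.vstack([rng.normal(0.0, 0.5, (20, 2)), rng.normal(10.0, 0.5, (20, 2))])
y_demo = np.array([0] * 20 + [1] * 20)
demo_leaves = split_hierarchy(X_demo, y_demo, n_classes=2)
demo_pred = predict(demo_leaves, X_demo)
```

Because the seeds come from the data rather than from a random draw, repeated runs on the same input produce the same hierarchy, which is the property the abstract emphasizes. Raising `max_impurity` stops splitting earlier and yields fewer leaves.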