检索规则说明:AND代表“并且”;OR代表“或者”;NOT代表“不包含”;(注意必须大写,运算符两边需空一格)
检 索 范 例 :范例一: (K=图书馆学 OR K=情报学) AND A=范并思 范例二:J=计算机应用与软件 AND (U=C++ OR U=Basic) NOT M=Visual
作 者:曲福恒[1] 钱超越 杨勇[1,2] 陆洋 宋剑飞 胡雅婷 QU Fu-heng;QIAN Chao-yue;YANG Yong;LU Yang;SONG Jian-fei;HU Ya-ting(College of Computer Science and Technology,Changchun University of Science and Technology,Changchun 130022,China;College of Education,Changchun Normal University,Changchun 130032,China;College of Information Technology,Jilin Agricultural University,Changchun 130118,China)
机构地区:[1]长春理工大学计算机科学技术学院,长春130022 [2]长春师范大学教育学院,长春130032 [3]吉林农业大学信息技术学院,长春130118
出 处:《吉林大学学报(工学版)》2022年第6期1434-1441,共8页Journal of Jilin University:Engineering and Technology Edition
基 金:吉林省教育厅“十三五”科学技术项目(JJKH20181164KJ);吉林省教育厅科学技术研究项目(JJKH20220330KJ);国家自然科学基金项目(41671397)。
摘 要:针对Ball k-means(BKM)算法对初始化中心敏感、求解精度不高的问题,提出了一种基于多球分裂的增量式k-means聚类算法。该算法利用BKM需要记录各球聚类半径的特点,从给定的初始聚类中心个数开始,按照固定步长一次性产生多个增量聚类中心,直至得到k个初始中心。本文算法避免了所有n个s维数据之间的距离计算,在不增加空间复杂度的前提下将传统增量聚类中心选取的计算量由O(n^(2)s)降低至O(k log k)(k<n)。在8组UCI数据上与k-means、BKM、IK-+以及3个典型增量聚类算法的对比实验结果表明,本文算法降低了BKM的初始化敏感性,解的精度提升了37.15%~66.92%,是对比增量算法中唯一能够提升k-means效率的算法,且随着聚类个数的增加,效率优势更加明显。To address the problem that the Ball k-means(BKM)algorithm is sensitive to the initialized centers and the solution accuracy is not high,an incremental k-means clustering algorithm based on multiball splitting is proposed.The algorithm takes advantage of the feature that BKM needs to record the clustering radius of each ball,starts from a given number of initial clustering centers,and generates multiple incremental clustering centers at once according to a fixed step until k initial centers are obtained.The algorithm avoids the calculation of distances between all n s-dimensional data and reduces the computation of traditional incremental clustering center selection from O(n^(2)s)to O(k log k)(k<n)without increasing the space complexity.The experimental results of comparison with k-means,BKM,IK-+and three typical incremental clustering algorithms on 8 UCI datasets show that the proposed algorithm reduces the initialization sensitivity of BKM and improves the solution accuracy by 37.15%~66.92%;it is the only algorithm among the comparative incremental algorithms that can improve the efficiency of k-means,and the efficiency advantage becomes more obvious as the number of clusters increases.
关 键 词:模式识别 K-MEANS 增量聚类 ball k-means 多球分裂
分 类 号:TP391[自动化与计算机技术—计算机应用技术]
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在链接到云南高校图书馆文献保障联盟下载...
云南高校图书馆联盟文献共享服务平台 版权所有©
您的IP:216.73.216.7