基于多球分裂的增量式k-means聚类算法被引量：4

Incremental k-means clustering algorithm based on multi-sphere splitting

作　　者：曲福恒[1] 钱超越杨勇[1,2] 陆洋宋剑飞胡雅婷 QU Fu-heng;QIAN Chao-yue;YANG Yong;LU Yang;SONG Jian-fei;HU Ya-ting(College of Computer Science and Technology,Changchun University of Science and Technology,Changchun 130022,China;College of Education,Changchun Normal University,Changchun 130032,China;College of Information Technology,Jilin Agricultural University,Changchun 130118,China)

机构地区：[1]长春理工大学计算机科学技术学院,长春130022 [2]长春师范大学教育学院,长春130032 [3]吉林农业大学信息技术学院,长春130118

出　　处：《吉林大学学报（工学版）》2022年第6期1434-1441,共8页Journal of Jilin University:Engineering and Technology Edition

基　　金：吉林省教育厅“十三五”科学技术项目(JJKH20181164KJ);吉林省教育厅科学技术研究项目(JJKH20220330KJ);国家自然科学基金项目(41671397)。

摘　　要：针对Ball k-means(BKM)算法对初始化中心敏感、求解精度不高的问题,提出了一种基于多球分裂的增量式k-means聚类算法。该算法利用BKM需要记录各球聚类半径的特点,从给定的初始聚类中心个数开始,按照固定步长一次性产生多个增量聚类中心,直至得到k个初始中心。本文算法避免了所有n个s维数据之间的距离计算,在不增加空间复杂度的前提下将传统增量聚类中心选取的计算量由O(n^(2)s)降低至O(k log k)(k<n)。在8组UCI数据上与k-means、BKM、IK-+以及3个典型增量聚类算法的对比实验结果表明,本文算法降低了BKM的初始化敏感性,解的精度提升了37.15%~66.92%,是对比增量算法中唯一能够提升k-means效率的算法,且随着聚类个数的增加,效率优势更加明显。To address the problem that the Ball k-means(BKM)algorithm is sensitive to the initialized centers and the solution accuracy is not high,an incremental k-means clustering algorithm based on multiball splitting is proposed.The algorithm takes advantage of the feature that BKM needs to record the clustering radius of each ball,starts from a given number of initial clustering centers,and generates multiple incremental clustering centers at once according to a fixed step until k initial centers are obtained.The algorithm avoids the calculation of distances between all n s-dimensional data and reduces the computation of traditional incremental clustering center selection from O(n^(2)s)to O(k log k)(k<n)without increasing the space complexity.The experimental results of comparison with k-means,BKM,IK-+and three typical incremental clustering algorithms on 8 UCI datasets show that the proposed algorithm reduces the initialization sensitivity of BKM and improves the solution accuracy by 37.15%~66.92%;it is the only algorithm among the comparative incremental algorithms that can improve the efficiency of k-means,and the efficiency advantage becomes more obvious as the number of clusters increases.

关键词：模式识别 K-MEANS 增量聚类 ball k-means 多球分裂

分类号：TP391[自动化与计算机技术—计算机应用技术]

参考文献：

正在载入数据...

二级参考文献：

正在载入数据...

耦合文献：

正在载入数据...

引证文献：

正在载入数据...

二级引证文献：

正在载入数据...

同被引文献：

正在载入数据...

高级检索 检索式检索

时间限定

期刊范围

学科限定全选

高级检索 检索式检索

时间限定

期刊范围

学科限定全选

基于多球分裂的增量式k-means聚类算法被引量：4

我的收藏

参考文献：

二级参考文献：

耦合文献：

引证文献：

二级引证文献：

同被引文献：

相关期刊文献：

相关的主题

相关的作者对象

相关的机构对象

下载全文

高级检索检索式检索

时间限定

期刊范围

学科限定全选

高级检索 检索式检索

时间限定

期刊范围

学科限定全选

基于多球分裂的增量式k-means聚类算法 被引量：4

我的收藏

参考文献：

二级参考文献：

耦合文献：

引证文献：

二级引证文献：

同被引文献：

相关期刊文献：

相关的主题

相关的作者对象

相关的机构对象

下载全文

用户登录

高级检索检索式检索

基于多球分裂的增量式k-means聚类算法被引量：4