An Adaptive Clustering Algorithm by Optimizing Initial Clustering Centers  (Cited by: 4)

Authors: CAO Duan-xi (曹端喜); TANG Jia-shan (唐加山) [2]; CHEN Xiang (陈香)

Affiliations: [1] School of Communication and Information Engineering, Nanjing University of Posts and Telecommunications; [2] School of Science, Nanjing University of Posts and Telecommunications, Nanjing 210000, Jiangsu, China

Source: Software Guide (《软件导刊》), 2020, No. 7, pp. 28-31 (4 pages)

Abstract: K-Means is one of the most popular and robust clustering algorithms. In practice, however, it has two problems: the number of clusters in a real data set cannot be determined in advance, and randomly selected initial cluster centers can easily trap the clustering result in a local optimum. This paper therefore proposes an adaptive improved algorithm based on the maximum distance median and the sum of squared errors (SSE). The algorithm computes the initial cluster centers rather than choosing them at random, and uses the trend of the SSE to decide whether to stop clustering or to continue splitting clusters, so that the number of clusters is determined automatically. Experiments on four UCI data sets show that, compared with the traditional clustering algorithm and without increasing the number of iterations, the improved algorithm raises clustering accuracy by 17.133%, 22.416%, 1.545%, and 0.238% respectively, and its clustering results are more stable.
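The idea in the abstract (deterministic seeding plus an SSE-driven decision to split or stop) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the maximum distance median is computed, so the seeding rule here (two far-apart points of the cluster with the largest SSE) is a stand-in, and the names `adaptive_kmeans`, `min_drop`, and `max_k` are hypothetical.

```python
def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: dist2(p, centers[j]))
            for p in points]

def sse(points, centers, labels):
    """Sum of squared errors of a clustering."""
    return sum(dist2(p, centers[l]) for p, l in zip(points, labels))

def kmeans(points, centers, max_iter=100):
    """Plain Lloyd iterations starting from the given centers."""
    centers = list(centers)
    labels = assign(points, centers)
    for _ in range(max_iter):
        for j in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # an empty cluster keeps its old center
                centers[j] = mean(members)
        new_labels = assign(points, centers)
        if new_labels == labels:
            break
        labels = new_labels
    return centers, labels

def adaptive_kmeans(points, min_drop=0.3, max_k=10):
    """Grow k by splitting the worst cluster while the SSE keeps falling.

    Assumption: a split is accepted only if it lowers the total SSE by at
    least the fraction `min_drop`; the paper's actual rule may differ.
    """
    centers = [mean(points)]
    labels = [0] * len(points)
    current = sse(points, centers, labels)
    while len(centers) < max_k and current > 0:
        # Find the cluster that contributes the most to the SSE.
        per_cluster = [0.0] * len(centers)
        for p, l in zip(points, labels):
            per_cluster[l] += dist2(p, centers[l])
        worst = max(range(len(centers)), key=per_cluster.__getitem__)
        members = [p for p, l in zip(points, labels) if l == worst]
        # Stand-in seeding: the point farthest from the cluster center,
        # then the point farthest from that one.
        a = max(members, key=lambda p: dist2(p, centers[worst]))
        b = max(members, key=lambda p: dist2(p, a))
        trial = centers[:worst] + centers[worst + 1:] + [a, b]
        new_centers, new_labels = kmeans(points, trial)
        new_sse = sse(points, new_centers, new_labels)
        if (current - new_sse) / current < min_drop:
            break  # SSE has flattened out: keep the previous clustering
        centers, labels, current = new_centers, new_labels, new_sse
    return centers, labels
```

Called on a list of coordinate tuples, `adaptive_kmeans(points)` returns the final centers and one label per point without being told the number of clusters. The threshold trades off over-splitting against under-splitting and would need tuning per data set.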

Keywords: clustering algorithm; K-Means algorithm; initial clustering centers; adaptive

Classification code: TP301 [Automation and Computer Technology: Computer Systems Architecture]

 
