Authors: CAO Duan-xi (曹端喜), TANG Jia-shan (唐加山) [2], CHEN Xiang (陈香)
Affiliations: [1] School of Communication and Information Engineering, Nanjing University of Posts and Telecommunications; [2] School of Science, Nanjing University of Posts and Telecommunications, Nanjing 210000, Jiangsu, China
Source: Software Guide (《软件导刊》), 2020, No. 7, pp. 28-31 (4 pages)
Abstract: K-Means is one of the most popular and robust clustering algorithms. In practice, however, the number of clusters in a real data set cannot be determined in advance, and randomly selected initial cluster centers can easily trap the clustering result in a local optimum. To address both problems, this paper proposes an adaptive improved algorithm based on the maximum-distance median and the sum of squared errors (SSE). The algorithm computes the initial cluster centers rather than choosing them at random, and uses the trend of the SSE to decide whether to stop clustering or to continue splitting clusters, thereby determining the number of clusters automatically. Experiments on four UCI data sets show that, compared with the traditional clustering algorithm, the improved algorithm raises clustering accuracy by 17.133%, 22.416%, 1.545%, and 0.238%, respectively, without increasing the number of iterations, and produces more stable clustering results.
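The abstract gives only the outline of the method, so the following is a rough sketch of the general idea rather than the authors' algorithm. It starts with one cluster, repeatedly splits the cluster with the largest SSE using a 2-means step, and stops once a split no longer reduces the total SSE by a chosen fraction. Everything specific here is an assumption for illustration: the function names, the `min_gain` threshold, and in particular the seeding rule in `pick_seeds` (the point farthest from the coordinate-wise median, then the point farthest from that one), which merely stands in for the paper's maximum-distance-median rule.

```python
from statistics import median


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(col) / len(points) for col in zip(*points))


def sse(points):
    """Sum of squared errors of a cluster around its own centroid."""
    c = centroid(points)
    return sum(sq_dist(p, c) for p in points)


def pick_seeds(points):
    # Hypothetical deterministic seeding (NOT taken from the paper):
    # the point farthest from the coordinate-wise median, then the
    # point farthest from that first seed.
    mid = tuple(median(col) for col in zip(*points))
    first = max(points, key=lambda p: sq_dist(p, mid))
    second = max(points, key=lambda p: sq_dist(p, first))
    return first, second


def split_in_two(points, max_iter=100):
    """Plain 2-means on one cluster, started from pick_seeds()."""
    centers = list(pick_seeds(points))
    parts = ([], [])
    for _ in range(max_iter):
        new_parts = ([], [])
        for p in points:
            # Ties go to the first center.
            idx = 0 if sq_dist(p, centers[0]) <= sq_dist(p, centers[1]) else 1
            new_parts[idx].append(p)
        converged = new_parts == parts
        parts = new_parts
        if converged or not parts[0] or not parts[1]:
            break
        centers = [centroid(part) for part in parts]
    return parts


def adaptive_kmeans(points, min_gain=0.5):
    """Split clusters until the relative SSE drop falls below min_gain."""
    clusters = [list(points)]
    while True:
        total = sum(sse(c) for c in clusters)
        worst = max(range(len(clusters)), key=lambda i: sse(clusters[i]))
        if total == 0 or len(clusters[worst]) < 2:
            break
        left, right = split_in_two(clusters[worst])
        if not left or not right:
            break
        candidate = clusters[:worst] + [left, right] + clusters[worst + 1:]
        new_total = sum(sse(c) for c in candidate)
        if (total - new_total) / total < min_gain:
            break  # SSE has flattened out: reject the split and stop
        clusters = candidate
    return clusters


# Toy data: three tight, well-separated groups of four points each.
demo = [(x, y) for base in (0, 10, 30) for x in (base, base + 1) for y in (0, 1)]
print(len(adaptive_kmeans(demo)))  # 3
```

On the toy data the first two splits each cut the total SSE by roughly 90% or more and are accepted, while a third split would cut it by only about 11% and is rejected, so the number of clusters settles at three without being supplied in advance. The 0.5 threshold is arbitrary and would need tuning on real data.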
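Keywords: clustering algorithm; K-Means algorithm; initial cluster centers; adaptive
Classification: TP301 [Automation and Computer Technology: Computer System Architecture]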