Bisecting K-means Clustering Method Based on Cohesion and Coupling

Cited by: 4

Authors: YU Yong [1,2], KANG Qing-yi [1], CHEN Chang-geng [1], KAN Shi-lin [1], LUO Yong-jun

Affiliations: [1] School of Software, Yunnan University, Kunming 650504, China; [2] Key Laboratory for Software Engineering of Yunnan Province, Kunming 650504, China

Source: Computer Science (《计算机科学》), 2018, Issue B06, pp. 460-464 (5 pages)

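Funding: Supported by the National Natural Science Foundation of China (61462091) and the Yunnan University Data-Driven Software Engineering Provincial Science and Technology Innovation Team Project (2017HC012)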

Abstract: Cluster analysis is one of the most important techniques in data mining; it plays an important role and is widely applied across many social and economic fields. K-means is one of the most classic and widely used clustering methods, but it depends heavily on the initial conditions and the number of clusters is difficult to determine, which limits its range of application. This paper introduces definitions and measures for the cohesion and coupling of clusters. Based on the principle of "high cohesion and low coupling", the clusters obtained during bisecting K-means clustering are repeatedly split and merged, and the clustering result is checked against the requirements to determine the number of clustering passes and the number of clusters, thereby improving the bisecting K-means clustering process. Experiments and analysis on the Iris data set show that the algorithm is not only more stable but also achieves relatively high clustering accuracy.
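The split-and-merge idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's actual definitions of cohesion and coupling, its thresholds, or its stopping rule, so everything here is an assumption made for illustration: `cohesion` is taken as 1 / (1 + mean distance to the centroid), `coupling` as 1 / (1 + distance between centroids), and the names `two_way_split`, `split_and_merge`, `min_cohesion`, and `max_coupling` are hypothetical. The sketch also splits first and merges afterwards, whereas the paper describes splitting and merging during the clustering process.

```python
import math

def centroid(pts):
    """Component-wise mean of a non-empty list of points."""
    n = len(pts)
    return tuple(sum(p[i] for p in pts) / n for i in range(len(pts[0])))

def cohesion(pts):
    """Assumed measure: close to 1 when points sit tightly around the centroid."""
    c = centroid(pts)
    return 1.0 / (1.0 + sum(math.dist(p, c) for p in pts) / len(pts))

def coupling(a, b):
    """Assumed measure: close to 1 when the two centroids are near each other."""
    return 1.0 / (1.0 + math.dist(centroid(a), centroid(b)))

def two_way_split(pts, iters=20):
    """2-means split with deterministic seeds (an illustrative choice)."""
    c = centroid(pts)
    s0 = max(pts, key=lambda p: math.dist(p, c))   # point farthest from centroid
    s1 = max(pts, key=lambda p: math.dist(p, s0))  # point farthest from s0
    left, right = [], []
    for _ in range(iters):
        new_left = [p for p in pts if math.dist(p, s0) <= math.dist(p, s1)]
        new_right = [p for p in pts if math.dist(p, s0) > math.dist(p, s1)]
        if not new_left or not new_right or new_left == left:
            break  # degenerate split or converged
        left, right = new_left, new_right
        s0, s1 = centroid(left), centroid(right)
    return left, right

def split_and_merge(points, min_cohesion=0.5, max_coupling=0.3):
    """Bisect loose clusters, then merge pairs that are too strongly coupled."""
    pending, done = [list(points)], []
    while pending:
        cluster = pending.pop()
        if len(cluster) > 1 and cohesion(cluster) < min_cohesion:
            left, right = two_way_split(cluster)
            if left and right:
                pending += [left, right]
                continue
        done.append(cluster)
    merged = True
    while merged:  # each merge removes one cluster, so this terminates
        merged = False
        for i in range(len(done)):
            for j in range(i + 1, len(done)):
                if coupling(done[i], done[j]) > max_coupling:
                    done[i] = done[i] + done[j]
                    del done[j]
                    merged = True
                    break
            if merged:
                break
    return done

# Toy data (not the Iris data set): three small, well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 0), (10, 1), (11, 0), (11, 1),
          (30, 0), (30, 1), (31, 0), (31, 1)]
clusters = split_and_merge(points)
print(len(clusters))  # 3 for this toy data
```

In this sketch the number of clusters is not supplied up front; it falls out of the two thresholds, which mirrors the motivation stated in the abstract. A threshold that is too strict over-splits a group, and the merge pass is what recombines the pieces.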

Keywords: clustering; bisecting K-means; cohesion; coupling

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
