Research on a method of self-adaptation of the number of clusters for hierarchical initialization clustering (cited by: 6)

Source: Electronic Design Engineering (《电子设计工程》), 2015, No. 6, pp. 5-8 (4 pages)

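Funding: Open project of the Key Discipline (Traditional Chinese Medicine Informatics) of the State Administration of Traditional Chinese Medicine (ZYYXXX-13)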

Abstract: K-means is a common and effective partition-based clustering algorithm. To address its sensitivity to the initial cluster centers, a frequently used remedy is to choose those centers by hierarchical initialization. However, hierarchically initialized K-means still takes the number of clusters as an input parameter. That number is hard to specify for high-dimensional and large-volume data, so the method is difficult to apply directly in those settings. To determine the number of clusters automatically, this paper integrates the Davies-Bouldin Index (DBI) into hierarchically initialized K-means and proposes a clustering method with hierarchical initialization and an adaptive number of clusters, abbreviated DHIKM. Experiments on UCI datasets and synthetic data show that DHIKM can quickly find a suitable number of clusters on sampled data, and the results indicate that the algorithm is effective in terms of clustering quality and convergence speed.

Keywords: K-means algorithm; hierarchical initialization; Davies-Bouldin Index; initial cluster centers; number of clusters

Classification: TP301 [Automation and Computer Technology: Computer System Architecture]
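
The idea of choosing the number of clusters with the DBI can be illustrated with a small sketch. It is not the DHIKM algorithm itself: the abstract does not give its details, so the farthest-point seeding below is only a stand-in for the paper's hierarchical initialization, the search runs on the full data rather than on a sample, and the function names (`init_centers`, `kmeans`, `davies_bouldin`, `choose_k`) and the search range `k_min`..`k_max` are assumptions made for illustration. The DBI used is the standard centroid-based definition, where a lower value indicates more compact, better separated clusters.

```python
import math


def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def init_centers(points, k):
    """Farthest-point seeding: a stand-in, NOT the paper's hierarchical initialization."""
    centers = [points[0]]
    while len(centers) < k:
        # Pick the point whose nearest existing center is farthest away.
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, max_iter=100):
    """Plain Lloyd iterations; returns (centers, labels)."""
    centers = init_centers(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = mean(members)
    return centers, labels


def davies_bouldin(points, centers, labels):
    """Standard DBI: mean over clusters of the worst (s_i + s_j) / d_ij ratio."""
    k = len(centers)
    scatter = []
    for j in range(k):
        members = [p for p, lab in zip(points, labels) if lab == j]
        s = sum(dist(p, centers[j]) for p in members) / len(members) if members else 0.0
        scatter.append(s)
    total = 0.0
    for i in range(k):
        total += max(
            (scatter[i] + scatter[j]) / dist(centers[i], centers[j])
            for j in range(k) if j != i
        )
    return total / k


def choose_k(points, k_min=2, k_max=6):
    """Run k-means for each candidate k and return the k with the lowest DBI."""
    scores = {}
    for k in range(k_min, k_max + 1):
        centers, labels = kmeans(points, k)
        scores[k] = davies_bouldin(points, centers, labels)
    return min(scores, key=scores.get), scores


# Toy data (hypothetical): three well-separated 3x3 grids of points.
offsets = (-0.5, 0.0, 0.5)
points = [
    (cx + dx, cy + dy)
    for cx, cy in [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    for dx in offsets
    for dy in offsets
]
best_k, scores = choose_k(points)
print(best_k, {k: round(v, 3) for k, v in scores.items()})
```

On this toy data the lowest DBI is reached at three clusters, matching the three groups that were generated. Scanning a fixed range of candidate values is the simplest possible search strategy and was chosen here only for clarity.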
