Hierarchical clustering algorithm for density-varying data (面向变尺度密度数据的分级聚类算法)  Cited by: 4

Authors: YUAN Zhiqin (袁志琴); ZHUANG Hualiang (庄华亮); HE Xiongxiong (何熊熊) [1] (College of Information Engineering, Zhejiang University of Technology, Hangzhou, Zhejiang 310023, China)

Affiliation: [1] College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023

Source: Journal of Computer Applications (《计算机应用》), 2020, Issue S02, pp. 54-59 (6 pages)

Funding: National Natural Science Foundation of China (61873239).

Abstract: Traditional distance-based and density-based clustering algorithms share several common problems: they are ill-suited to data whose density varies across multiple scales and to non-convex data, their clustering quality depends heavily on parameter choices, and their computational complexity is high. To address these problems, a hierarchical clustering algorithm based on region growing and competition is proposed. Clustering proceeds in three levels. The first level uses Euclidean distance: a distance threshold partitions the objects into a number of small clusters that cover the data space, which also reduces the complexity of the algorithm. The second level applies spatial-data region growing: the cluster centers already obtained serve as growth seeds, and growth proceeds under a gradually relaxed cluster-radius criterion, which addresses the clustering of density-varying data. The third level, based on the idea of competition and the principle of density similarity, computes weights between cluster centers and merges clusters according to a set of rules, which addresses the clustering of non-convex data. Experimental results show that, compared with K-means and Density-Based Spatial Clustering of Applications with Noise (DBSCAN), the proposed algorithm handles density-varying data while improving clustering accuracy and reducing clustering time.
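The abstract gives only an outline of the first level, so the following is a minimal sketch of one way a distance threshold can partition points into small covering clusters. It is not the authors' implementation: the function name `threshold_cover`, the parameter `radius`, and the choice to use the first point of each small cluster as its fixed center are all assumptions made for illustration.

```python
import math


def threshold_cover(points, radius):
    """Partition points into small clusters using a Euclidean distance threshold.

    Assumption (not from the paper): each point joins the nearest existing
    center that lies within `radius`; if none does, the point opens a new
    small cluster and becomes its center.
    """
    centers = []  # one fixed center per small cluster
    labels = []   # index of the small cluster assigned to each point
    for p in points:
        best, best_d = None, radius
        for i, c in enumerate(centers):
            d = math.dist(p, c)  # Euclidean distance (Python 3.8+)
            if d <= best_d:
                best, best_d = i, d
        if best is None:
            centers.append(tuple(p))
            labels.append(len(centers) - 1)
        else:
            labels.append(best)
    return centers, labels


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.0, 0.2)]
    centers, labels = threshold_cover(pts, radius=1.0)
    print(centers)
    print(labels)
```

Under these assumptions the number of small clusters is governed by `radius` alone, and each point is compared only against the current centers rather than against every other point, which is the kind of cost reduction the abstract attributes to the first level. The centers returned here would play the role of the growth seeds mentioned for the second level.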

Keywords: hierarchical clustering; density-varying data; region growing; relationship weight; cluster merging

Classification number: TP301.6 [Automation and Computer Technology - Computer System Architecture]

 
