Density peak clustering algorithm based on adaptive nearest neighbor parameters (cited by: 1)


Authors: ZHOU Huanhuan; ZHENG Bochuan; ZHANG Zheng [1]; ZHANG Qi (School of Mathematics and Information, China West Normal University, Nanchong, Sichuan 637009, China; School of Computer Science, China West Normal University, Nanchong, Sichuan 637009, China)

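Affiliations: [1] School of Mathematics and Information, China West Normal University, Nanchong, Sichuan 637009, China; [2] School of Computer Science, China West Normal University, Nanchong, Sichuan 637009, China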

Source: Journal of Computer Applications (《计算机应用》), 2022, No. 5, pp. 1464-1471 (8 pages)

Funding: National Natural Science Foundation of China (62176217).

Abstract: To address the problem that the nearest neighbor parameter must be set manually in density peak clustering algorithms based on shared nearest neighbors, a density peak clustering algorithm based on adaptive nearest neighbor parameters is proposed. First, the proposed nearest neighbor parameter search algorithm obtains the nearest neighbor parameter automatically. Then, the cluster centers are selected through the decision graph. Finally, following the proposed representative point allocation strategy, the representative points are assigned first and the non-representative points afterwards, so that all sample points are clustered. The clustering results of the proposed algorithm were compared with those of six algorithms on synthetic datasets and UCI datasets: Shared-Nearest-Neighbor-based Clustering by fast search and find of Density Peaks (SNN-DPC), Clustering by fast search and find of Density Peaks (DPC), Affinity Propagation (AP), Ordering Points To Identify the Clustering Structure (OPTICS), Density-Based Spatial Clustering of Applications with Noise (DBSCAN), and K-means. Experimental results show that the proposed algorithm is overall better than the other six algorithms on evaluation metrics such as Adjusted Mutual Information (AMI), Adjusted Rand Index (ARI), and Fowlkes and Mallows Index (FMI). The proposed algorithm can automatically obtain effective nearest neighbor parameters and can better allocate the sample points in the edge regions of clusters.
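For readers unfamiliar with the decision graph mentioned in the abstract, the following is a minimal sketch of the classic DPC baseline only, not the algorithm proposed in this paper. It does not use shared nearest neighbors, the adaptive nearest neighbor parameter search, or the representative point allocation strategy, none of which are specified in the abstract. The function name `dpc_sketch`, the Gaussian-kernel density, and the parameters `dc` and `n_clusters` are assumptions made for illustration.

```python
import math

def dpc_sketch(points, dc, n_clusters):
    """Classic DPC baseline (illustrative): returns (labels, centers)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density rho: Gaussian kernel with cutoff distance dc (an assumed choice).
    rho = [sum(math.exp(-(dist[i][j] / dc) ** 2) for j in range(n) if j != i)
           for i in range(n)]

    # Visit points in order of decreasing density.
    order = sorted(range(n), key=lambda i: -rho[i])
    rank = {i: r for r, i in enumerate(order)}

    # delta: distance to the nearest point that comes earlier in the density order;
    # parent: that point. The densest point gets its largest distance instead.
    delta = [0.0] * n
    parent = [-1] * n
    for r, i in enumerate(order):
        if r == 0:
            delta[i] = max(dist[i])
        else:
            j = min(order[:r], key=lambda h: dist[i][h])
            delta[i] = dist[i][j]
            parent[i] = j

    # Decision graph: points with large rho * delta are taken as cluster centers.
    # Ties are broken by density rank so the densest point is always a center.
    gamma = [rho[i] * delta[i] for i in range(n)]
    centers = sorted(range(n), key=lambda i: (-gamma[i], rank[i]))[:n_clusters]

    labels = [-1] * n
    for c, i in enumerate(centers):
        labels[i] = c
    # Each remaining point inherits the label of its denser nearest neighbor.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers
```

In this baseline both `dc` and the number of centers are chosen by hand; the abstract states that the proposed algorithm instead obtains its nearest neighbor parameter automatically.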

Keywords: shared nearest neighbor; local density; density peak clustering; K-nearest neighbor; reverse nearest neighbor

CLC number: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]

 
