基于反向最近邻的密度估计聚类算法  

Density Estimation Clustering Method Based on Reverse Nearest Neighbor

作  者:许梅梅 侯新民 XU Meimei;HOU Xinmin(School of Data Science,University of Science and Technology of China,Hefei 230026,China;School of Mathematical Sciences,University of Science and Technology of China,Hefei 230026,China;CAS Key Laboratory ofWu Wen-Tsun Mathematics,University of Science and Technology of China,Hefei 230026,China;Hefei National Laboratory,Hefei 230088,China)

机构地区:[1]中国科学技术大学大数据学院,合肥230026 [2]中国科学技术大学数学科学学院,合肥230026 [3]中国科学技术大学中国科学院吴文俊数学重点实验室,合肥230026 [4]合肥国家实验室,合肥230088

出  处:《计算机工程与应用》2025年第1期165-173,共9页Computer Engineering and Applications

基  金:国家自然科学基金(12071453);量子通信与量子计算机重大项目(2021ZD0302902)。

摘  要:基于相互最近邻的密度峰聚类算法(DenMune)通过相互最近邻计算数据点的局部密度,是一种有效的聚类手段。但该算法存在构建聚类骨架不合理的问题,在分配弱点时采用硬投票策略,易产生错误。因此提出一种新的基于反向最近邻的密度估计聚类算法(RNN-DEC)。该算法引入反向最近邻来计算数据点的局部密度,将数据点分成强点、弱点和噪声点。使用强点构建聚类算法的骨架,通过软投票的方式将弱点分配到与其相似度最高的簇中去。提出了一种基于反向最近邻的簇融合算法,将相似度高的子簇融合,得到最终的聚类结果。实验结果表明,在一些合成数据集和UCI真实数据集上,相比较于其他经典算法,该算法具有更好的聚类效果。The density peak clustering algorithm based on mutual nearest neighbors(DenMune)is an effective clustering method that calculates the local density of data points through mutual nearest neighbors.However,this algorithm has the problem of unreasonable construction of clustering skeletons,and using a hard voting strategy when allocating weaknesses can easily lead to errors.Therefore,a new density estimation clustering algorithm based on reverse nearest neighbor(RNN-DEC)is proposed.This algorithm introduces reverse nearest neighbors to calculate the local density of data points,dividing them into strong points,weak points,and noisy points.It builds the skeleton of the clustering algorithm using strong points and assigns weaknesses to the cluster with the highest similarity through soft voting.Finally,a cluster fusion algorithm based on reverse nearest neighbor is proposed,which fuses high similarity sub clusters to obtain the final clustering result.The experimental results show that compared to other classical algorithms,this algorithm has better clustering performance on some synthetic datasets and UCI real datasets.

关 键 词:反向最近邻 局部密度 密度聚类算法 子簇融合 

分 类 号:TP391[自动化与计算机技术—计算机应用技术]

 

参考文献:

正在载入数据...

 

二级参考文献:

正在载入数据...

 

耦合文献:

正在载入数据...

 

引证文献:

正在载入数据...

 

二级引证文献:

正在载入数据...

 

同被引文献:

正在载入数据...

 

相关期刊文献:

正在载入数据...

相关的主题
相关的作者对象
相关的机构对象