Density peak clustering algorithm based on non-parametric kernel density estimation  Cited by: 7


Authors: Xie Guowei, Qian Xuezhong[1], Zhou Shibing[1] (School of Internet of Things Engineering, Jiangnan University, Wuxi, Jiangsu 214122, China)

Affiliation: [1] School of Internet of Things Engineering, Jiangnan University, Wuxi, Jiangsu 214122, China

Source: Application Research of Computers (《计算机应用研究》), 2018, No. 10, pp. 2956-2959 (4 pages)

Funding: National Natural Science Foundation of China (61673193); Fundamental Research Funds for the Central Universities (JUSRP11235; JUSRP51635B)

Abstract: The density peak clustering algorithm CFSFDP (clustering by fast search and find of density peaks) requires the cutoff distance used in the density calculation to be chosen by hand and the cluster centers to be picked manually. To address these shortcomings, this paper proposes a density peak clustering algorithm based on non-parametric kernel density estimation. First, the local density of each data point is computed by non-parametric kernel density estimation. Second, an automatic center selection strategy applied to the sorted graph determines the potential cluster centers, and the remaining data points are assigned to the corresponding centers. Finally, neighboring similar sub-clusters are merged according to an inter-cluster merging criterion, and noise points are identified by boundary density to obtain the clustering result. Experiments on synthetic test data sets and real UCI data sets show that, compared with the original CFSFDP algorithm, the new algorithm not only avoids the subjectivity of manually choosing the cutoff distance and the cluster centers but also achieves higher accuracy.
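The first two steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the kernel, the bandwidth rule, the automatic center selection strategy, the merging criterion, or the noise rule. The sketch therefore assumes a Gaussian kernel with a user-supplied `bandwidth`, and uses a hypothetical `n_centers` argument (the points with the largest density-times-distance score) as a stand-in for the automatic selection. Sub-cluster merging and noise detection are omitted. All function and parameter names are illustrative.

```python
import math


def gaussian_kde_density(points, bandwidth):
    """Gaussian kernel density estimate at every point.

    Used here in place of CFSFDP's cutoff-distance neighbor count.
    The kernel and the fixed bandwidth are assumptions of this sketch.
    """
    n, d = len(points), len(points[0])
    norm = n * (bandwidth * math.sqrt(2 * math.pi)) ** d
    return [
        sum(math.exp(-math.dist(p, q) ** 2 / (2 * bandwidth ** 2)) for q in points) / norm
        for p in points
    ]


def density_peak_cluster(points, bandwidth, n_centers):
    """Density-peak clustering with KDE densities (no merging, no noise step)."""
    n = len(points)
    rho = gaussian_kde_density(points, bandwidth)
    order = sorted(range(n), key=lambda i: -rho[i])  # indices by descending density

    delta = [0.0] * n   # distance to the nearest point of higher density
    parent = [-1] * n   # index of that nearest higher-density point
    for rank, i in enumerate(order):
        if rank == 0:
            continue
        j = min(order[:rank], key=lambda k: math.dist(points[i], points[k]))
        delta[i] = math.dist(points[i], points[j])
        parent[i] = j
    # The densest point has no higher-density neighbor; by convention its
    # delta is the largest distance to any other point.
    top = order[0]
    delta[top] = max(math.dist(points[top], q) for q in points)

    # Hypothetical center rule: take the n_centers largest rho * delta scores.
    # Sorting `order` keeps the densest point first when scores tie.
    gamma = [rho[i] * delta[i] for i in range(n)]
    centers = sorted(order, key=lambda i: -gamma[i])[:n_centers]

    labels = [-1] * n
    for cluster_id, c in enumerate(centers):
        labels[c] = cluster_id
    # Visit points from dense to sparse so each parent is labeled first.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers
```

Calling `density_peak_cluster(points, bandwidth=1.0, n_centers=2)` on two well-separated groups of 2-D tuples returns one label per point and the indices of the chosen centers. The bandwidth still has to be chosen in this sketch, so it does not reproduce the parameter-free behavior the abstract claims for the proposed algorithm.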

Keywords: clustering; density peaks; non-parametric kernel density estimation; cutoff distance

Classification code: TP301.6 [Automation and Computer Technology: Computer System Architecture]

 
