An improved clustering algorithm that searches and finds density peaks  (Cited by: 16)


Authors: GAN Wenyan (淦文燕)[1], LIU Chong (刘冲)[1]

Affiliation: [1] College of Command Information Systems, PLA University of Science and Technology, Nanjing, Jiangsu 210007, China

Source: CAAI Transactions on Intelligent Systems (《智能系统学报》), 2017, No. 2, pp. 229-236 (8 pages)

Funding: National Natural Science Foundation of China (60974086)

Abstract: Clustering is a fundamental problem in big data analysis and data mining. The paper "Clustering by fast search and find of density peaks", published in Science in 2014, proposed a simple yet effective clustering algorithm based on the idea that cluster centers have a higher density than their neighbors and lie at a relatively large distance from any point of higher density. That algorithm can detect clusters of arbitrary shape and differing density, but its results depend on an empirically chosen parameter dc. This paper proposes an improved density-peak clustering algorithm that introduces a density-estimation entropy to optimize dc adaptively. The time complexity of the improved algorithm is super-linear in the size of the dataset. Theoretical analysis and comparative experiments show that the improved method removes the need to set the parameter by hand and achieves relatively better clustering performance.
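
The idea summarized in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the record does not give the paper's formulas, so the sketch assumes a Gaussian kernel density estimate, treats the Shannon entropy of the normalized densities as the "density-estimation entropy", and picks the candidate dc with the lowest entropy. All function names and the candidate list are hypothetical.

```python
import math

def pairwise_distances(points):
    """Full Euclidean distance matrix (quadratic in the number of points)."""
    return [[math.dist(p, q) for q in points] for p in points]

def gaussian_density(dist, dc):
    """Assumed Gaussian-kernel density of each point; dc acts as the bandwidth."""
    return [sum(math.exp(-(d / dc) ** 2) for d in row) for row in dist]

def density_entropy(dist, dc):
    """Shannon entropy of the normalized densities (assumed selection criterion)."""
    rho = gaussian_density(dist, dc)
    total = sum(rho)
    return -sum((r / total) * math.log(r / total) for r in rho)

def select_dc(dist, candidates):
    """Pick the candidate dc with the smallest density entropy."""
    return min(candidates, key=lambda dc: density_entropy(dist, dc))

def density_peaks(points, n_clusters, candidates):
    """Density-peak clustering with dc chosen by select_dc (hypothetical helper)."""
    dist = pairwise_distances(points)
    dc = select_dc(dist, candidates)
    rho = gaussian_density(dist, dc)
    n = len(points)

    # Points sorted by decreasing density; Python's sort is stable, so ties
    # keep their index order.
    order = sorted(range(n), key=lambda i: -rho[i])

    # delta[i]: distance to the nearest point that precedes i in density order.
    # parent[i]: that point's index. The densest point gets its largest distance.
    delta = [0.0] * n
    parent = [-1] * n
    top = order[0]
    delta[top] = max(dist[top])
    for rank in range(1, n):
        i = order[rank]
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i], parent[i] = dist[i][j], j

    # Cluster centers: largest rho * delta (high density, far from denser points).
    centers = sorted(range(n), key=lambda i: -rho[i] * delta[i])[:n_clusters]
    labels = [-1] * n
    for cluster_id, i in enumerate(centers):
        labels[i] = cluster_id

    # Remaining points inherit the label of their nearest denser neighbor.
    # Visiting in density order guarantees the parent is already labeled.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, dc
```

Calling `density_peaks(points, 2, [0.5, 1.0, 2.0])` on two well-separated groups of 2-D points returns one label per point together with the selected dc. The sketch computes the full distance matrix, so it is quadratic in the number of points and is meant only to show the structure of the method.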

Keywords: data mining; clustering algorithm; kernel density estimation

Classification number: TP311.13 [Automation and Computer Technology: Computer Software and Theory]

 
