K-means Clustering Algorithm Based on Grid and Density  Cited by: 1


Author: LI Yongding (李永定), Lanzhou Jiaotong University, Lanzhou 730070, China

Affiliation: [1] School of Traffic and Transportation, Lanzhou Jiaotong University

Source: Journal of Luoyang Institute of Science and Technology: Natural Science Edition (《洛阳理工学院学报(自然科学版)》), 2019, No. 4, pp. 48-54 (7 pages)

Abstract: In the k-means clustering algorithm, both the choice of initial cluster centers and the presence of outliers in the data strongly affect the result. To address this problem, a grid- and density-based k-means clustering algorithm, GD-k-means, is proposed. The algorithm first maps the data set onto a grid, forming grid clusters as a preliminary clustering. A density threshold then divides these into low-density and high-density grid clusters. Initial cluster centers are selected from the high-density grid clusters and the traditional k-means algorithm is iterated, with an evaluation condition determining whether grid clusters need to be merged. After clustering completes, the data in the low-density grid clusters are assigned to clusters according to the nearest-distance principle. Experimental results show that GD-k-means produces more stable clustering results and resists interference from noisy data.
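
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not specify the grid resolution, the density threshold, the rule for picking seeds among high-density cells, or the merge evaluation condition. Here `cell_size` and `density_threshold` are hypothetical parameters supplied by the caller, seeds are chosen with an assumed farthest-first heuristic over the centroids of high-density cells, and the grid-cluster merging step is omitted entirely.

```python
import math
from collections import defaultdict


def centroid(pts):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(col) / len(pts) for col in zip(*pts))


def nearest(p, centers):
    """Index of the center closest to p (Euclidean distance)."""
    return min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))


def gd_kmeans_sketch(points, k, cell_size, density_threshold, max_iter=100):
    """Grid/density-seeded k-means (illustrative only); returns (centers, labels)."""
    # 1. Map every point onto a uniform grid (cell_size is an assumed parameter).
    cells = defaultdict(list)
    for i, p in enumerate(points):
        cells[tuple(math.floor(x / cell_size) for x in p)].append(i)

    # 2. Split cells into high-density and low-density by a count threshold.
    dense = {c: m for c, m in cells.items() if len(m) >= density_threshold}
    if len(dense) < k:
        raise ValueError("fewer high-density cells than requested clusters")
    core = [i for m in dense.values() for i in m]
    sparse = [i for c, m in cells.items() if c not in dense for i in m]

    # 3. Seed from high-density cells only. Assumed heuristic: start at the
    #    densest cell, then repeatedly take the cell centroid farthest from
    #    the seeds chosen so far (the paper's own rule is not given).
    cand = {c: centroid([points[i] for i in m]) for c, m in dense.items()}
    first = min(cand, key=lambda c: (-len(dense[c]), c))
    centers = [cand.pop(first)]
    while len(centers) < k:
        far = max(cand, key=lambda c: min(math.dist(cand[c], s) for s in centers))
        centers.append(cand.pop(far))

    # 4. Standard k-means iterations, restricted to points in dense cells.
    labels = [-1] * len(points)
    for _ in range(max_iter):
        for i in core:
            labels[i] = nearest(points[i], centers)
        updated = []
        for j in range(k):
            members = [points[i] for i in core if labels[i] == j]
            # Keep the old center if a cluster happens to lose all members.
            updated.append(centroid(members) if members else centers[j])
        if updated == centers:
            break
        centers = updated

    # 5. Assign low-density points to the nearest final center; they never
    #    influence the center positions.
    for i in sparse:
        labels[i] = nearest(points[i], centers)
    return centers, labels
```

Because points in low-density cells are excluded from steps 3 and 4, an isolated outlier cannot become a seed or drag a center toward itself; it only receives a label at the end. Calling `gd_kmeans_sketch(points, k=2, cell_size=1.0, density_threshold=3)` on a list of 2-D tuples returns the final centers and one label per input point.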

Keywords: k-means algorithm; grid cluster; density; number of clusters; hierarchical clustering

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
