Research on a Dynamic Clustering Method Based on KD-Tree and K-means  Cited by: 16

Dynamic clustering algorithm based on KD-tree and K-means method


Authors: Wan Jing (万静)[1], Zhang Yi (张义)[1], He Yunbin (何云斌)[1], Li Song (李松)[1]

Affiliation: [1] School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080

Source: Application Research of Computers (《计算机应用研究》), 2015, No. 12, pp. 3590-3595 (6 pages)

Funding: Science and Technology Research Project of the Heilongjiang Provincial Department of Education (12511100)

Abstract: The traditional K-means clustering algorithm is sensitive to the choice of initial centers and easily becomes trapped in local optima. To address this, the paper first proposes a KD-tree-based method for selecting initial cluster centers. The method builds a KD-tree that partitions the data set into rectangular cells, computes the center and density of each cell, sorts the cells by density in descending order, and takes the centers of the first k cells as the initial cluster centers, which effectively reduces the sensitivity of the traditional algorithm to its initial centers. In addition, because the traditional K-means algorithm cannot handle the clustering of dynamic data effectively, the paper further proposes the KDTK-means clustering algorithm. This algorithm builds a new KD-tree from the k KD-tree-optimized cluster centers together with the incremental data, and uses a nearest-neighbor search strategy to assign each incremental data point to its corresponding cluster, completing the clustering. Experimental results show that, compared with the traditional K-means algorithm, the proposed KD-tree-based selection method picks representative initial centers and yields better clustering quality, and that the proposed KDTK-means algorithm handles incremental data clustering quickly and efficiently while maintaining high clustering quality.
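The two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions, because the abstract does not define them: cells are the leaves of a median-split KD-tree with a hypothetical `leaf_size` parameter, the cell center is taken as the mean of the points in the cell, and cell density is taken as the point count divided by the volume of the cell's bounding box. For the incremental step, a brute-force nearest-center search stands in for the KD-tree nearest-neighbor search that the paper describes. All function names are illustrative.

```python
import math


def kd_leaves(points, leaf_size, depth=0):
    """Split points at the median, cycling through the axes; return the leaf cells."""
    if len(points) <= leaf_size:
        return [points]
    axis = depth % len(points[0])
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return (kd_leaves(pts[:mid], leaf_size, depth + 1)
            + kd_leaves(pts[mid:], leaf_size, depth + 1))


def cell_center(cell):
    """Assumed definition: the mean of the points that fall in the cell."""
    dims = len(cell[0])
    return tuple(sum(p[d] for p in cell) / len(cell) for d in range(dims))


def cell_density(cell, eps=1e-9):
    """Assumed definition: point count divided by bounding-box volume."""
    volume = 1.0
    for d in range(len(cell[0])):
        side = max(p[d] for p in cell) - min(p[d] for p in cell)
        volume *= max(side, eps)  # eps guards against zero-width cells
    return len(cell) / volume


def initial_centers(points, k, leaf_size=8):
    """Return the centers of the k densest cells as initial cluster centers."""
    cells = kd_leaves(list(points), leaf_size)
    cells.sort(key=cell_density, reverse=True)
    return [cell_center(c) for c in cells[:k]]


def assign_incremental(centers, new_points):
    """Label each new point with the index of its nearest center.

    Brute-force search is used here in place of a KD-tree query.
    """
    return [min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            for p in new_points]
```

Under these assumptions, `initial_centers(data, k)` would supply the seeds for ordinary K-means iterations (not shown), and `assign_incremental(centers, batch)` would label a newly arrived batch without re-clustering the existing data.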

Keywords: K-means clustering; KD-tree; incremental clustering; initial cluster centers

Classification code: TP391.3 [Automation and Computer Technology: Computer Application Technology]

 
