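Affiliation: [1] School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080
Source: Application Research of Computers (《计算机应用研究》), 2015, No. 12, pp. 3590-3595 (6 pages)
Funding: Science and Technology Research Project of the Heilongjiang Provincial Department of Education (12511100)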
Abstract: The traditional K-means clustering algorithm is sensitive to its initial centers and prone to converging to local optima. To address this, the paper first proposes a KD-tree-based method for selecting initial cluster centers. The method builds a KD-tree that partitions the dataset into rectangular cells, computes the center and density of each cell, sorts the cells by density in descending order, and takes the centers of the first k cells as the initial cluster centers, which reduces the traditional algorithm's sensitivity to initial centers. In addition, because the traditional K-means algorithm cannot effectively cluster dynamic data, the paper further proposes the KDTK-means clustering algorithm. This algorithm builds a new KD-tree from the k KD-tree-optimized cluster centers and the incremental data, then uses a nearest-neighbor search strategy to assign each incremental data point to its corresponding cluster, completing the clustering. Experimental results show that, compared with the traditional K-means algorithm, the proposed initial-center selection method chooses representative initial centers, depends only weakly on the initial data, and yields better clustering quality, and that the KDTK-means algorithm handles incremental clustering quickly and efficiently with high clustering quality.
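The two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define cell density, the cell center, or the split rule, so the sketch assumes density is the point count divided by the bounding-box volume, the cell center is the mean of the cell's points, and splits are made at the median along the widest dimension. All function names and the `leaf_size` parameter are hypothetical. For brevity, the incremental step uses a brute-force nearest-center scan in place of the KD-tree nearest-neighbor search the paper describes.

```python
import math
import random


def split_into_cells(points, leaf_size):
    """Recursively median-split points, KD-tree style, and return the leaf cells."""
    if len(points) <= leaf_size:
        return [points]
    dims = len(points[0])
    # Assumption: split along the dimension with the widest spread.
    axis = max(range(dims),
               key=lambda d: max(p[d] for p in points) - min(p[d] for p in points))
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return (split_into_cells(ordered[:mid], leaf_size)
            + split_into_cells(ordered[mid:], leaf_size))


def cell_center(cell):
    """Assumption: the cell center is the mean of the points in the cell."""
    dims = len(cell[0])
    return tuple(sum(p[d] for p in cell) / len(cell) for d in range(dims))


def cell_density(cell, eps=1e-9):
    """Assumption: density = point count / bounding-box volume (eps avoids zero volume)."""
    dims = len(cell[0])
    volume = 1.0
    for d in range(dims):
        volume *= (max(p[d] for p in cell) - min(p[d] for p in cell)) + eps
    return len(cell) / volume


def initial_centers(points, k, leaf_size=8):
    """Sort cells by density, descending, and return the centers of the first k cells."""
    cells = split_into_cells(list(points), leaf_size)
    cells.sort(key=cell_density, reverse=True)
    return [cell_center(c) for c in cells[:k]]


def assign_incremental(centers, new_points):
    """Assign each new point to its nearest center (brute-force stand-in for KD-tree search)."""
    return [min(range(len(centers)), key=lambda i: math.dist(centers[i], p))
            for p in new_points]
```

Calling `initial_centers(points, k)` returns up to k seed centers that could be passed to any standard K-means routine, and `assign_incremental(centers, new_points)` returns one cluster index per new point. Under these assumed definitions, nothing prevents several of the densest cells from falling inside the same cluster, so a fuller implementation would likely need an additional separation criterion.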
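Keywords: K-means clustering; KD-tree; incremental clustering; initial cluster centers
Classification: TP391.3 [Automation and Computer Technology: Computer Application Technology]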