A new genetic K-means clustering algorithm based on initial center optimization  Cited by: 17

New genetic K-means clustering algorithm based on meliorated initial center

Authors: Sun Xiujuan[1], Liu Xiyu[2]

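Affiliations: [1] School of Information Science and Engineering, Shandong Normal University, Jinan 250014; [2] School of Management, Shandong Normal University, Jinan 250014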

Source: Computer Engineering and Applications, 2008, No. 23, pp. 166-168, 182 (4 pages)

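Funding: Major Project of the Natural Science Foundation of Shandong Province (No. Z2004G02); Shandong Province Award Fund for Young and Middle-Aged Scientists (No. 03BS003); Science and Technology Program of the Shandong Education Department (No. J05G01); special funding from the "Taishan Scholar" Construction Project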

Abstract: A good K-means clustering algorithm should satisfy at least two requirements: (1) it should reflect the validity of the clustering, that is, the number of clusters should match the actual problem; and (2) it should be able to handle noisy data. The traditional K-means algorithm is a local search algorithm that is sensitive to initialization and easily trapped in local optima. To address this shortcoming, a K-means algorithm with optimized initial centers is proposed, which selects as initial cluster centers k data objects that lie in high-density regions and are farthest from one another. Experiments show that the algorithm depends only weakly on the initial data, converges quickly, and yields high clustering quality. To reflect clustering validity and obtain more accurate clustering results, a hybrid algorithm (PGKM) is further proposed that combines the optimized K-means algorithm (PKM) with a genetic algorithm. While improving compactness (intra-cluster distance) and separation (inter-cluster distance), the hybrid algorithm automatically searches for the best number of clusters k, optimizes the k initial centers before clustering, and iterates until it obtains the optimal clustering that satisfies the termination condition. Experiments show that the hybrid algorithm achieves better clustering quality and overall performance.
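The abstract describes the initial-center selection only in outline: k data objects in high-density regions, as far apart from one another as possible. A minimal Python sketch of that idea follows. It is not the paper's procedure; the details are assumptions made for illustration: density is taken as the number of neighbours within a hypothetical `radius` parameter, a point counts as high-density when its density is at least the mean density, and far-apart points are picked greedily (maximin), starting from the densest point. The function names `dist` and `initial_centers` are likewise hypothetical.

```python
import math


def dist(a, b):
    """Euclidean distance between two points given as sequences of numbers."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def initial_centers(points, k, radius):
    """Pick k mutually distant points from the high-density region.

    Assumptions (not taken from the paper): density is the number of
    neighbours within `radius`; a point is "high density" when its density
    is at least the mean density; far-apart points are chosen greedily,
    starting from the densest point.
    """
    n = len(points)
    density = [
        sum(1 for j in range(n) if j != i and dist(points[i], points[j]) <= radius)
        for i in range(n)
    ]
    mean_density = sum(density) / n
    dense = [i for i in range(n) if density[i] >= mean_density]
    if len(dense) < k:
        raise ValueError("fewer than k high-density points; increase radius")

    # Start from the densest point, then repeatedly add the dense point
    # whose distance to its nearest already-chosen center is largest.
    chosen = [max(dense, key=lambda i: density[i])]
    while len(chosen) < k:
        candidates = [i for i in dense if i not in chosen]
        nxt = max(
            candidates,
            key=lambda i: min(dist(points[i], points[c]) for c in chosen),
        )
        chosen.append(nxt)
    return [points[i] for i in chosen]
```

Under these assumptions an isolated noise point has low density and is never selected as a center, which is one plausible reading of the noise-handling requirement stated in the abstract. The returned centers would then seed an ordinary K-means run.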

Keywords: clustering; K-means algorithm; genetic algorithm

Classification number: TP301.6 [Automation and Computer Technology: Computer System Architecture]

 
