An Improved K-medoids Algorithm for Initial Center of Variance Optimization  Cited by: 1

Authors: 张晓滨 (ZHANG Xiao-bin)[1]; 母玉雪 (MU Yu-xue) (School of Computer Science, Xi'an Polytechnic University, Xi'an 710600, China)

Affiliation: [1] School of Computer Science, Xi'an Polytechnic University, Xi'an 710600, Shaanxi, China

Source: Computer Technology and Development (《计算机技术与发展》), 2020, No. 7, pp. 42-45, 134 (5 pages in total)

Funding: Natural Science Foundation of Shaanxi Province (2015JQ5157).

Abstract: The traditional K-medoids algorithm is sensitive to its initial values, easily falls into local optima, and has poor stability, while the existing K-medoids clustering algorithm with variance-optimized initial centers suffers from high time complexity and an imprecise neighborhood radius. To address these problems, an improved K-medoids clustering algorithm based on variance-optimized initial centers is proposed. The algorithm introduces the concept of global variance and uses it as the density parameter of each sample. Samples with comparatively small variance form a candidate set of initial cluster centers, and the maximum distance product method then selects from this set the K samples that have small variance and lie far apart as the initial cluster centers, so that the initial centers are both well dispersed and representative. When cluster centers are updated, the search range is expanded step by step according to the sample density principle, replacing the traditional random selection. Experimental results on UCI data sets show that the algorithm not only effectively optimizes the selection of initial cluster centers but also improves clustering speed and clustering quality.
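The initialization idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not give formulas, so several details here are assumptions rather than the authors' definitions: the "global variance" of a sample is taken as its mean squared Euclidean distance to all other samples, the candidate set is the fraction `candidate_ratio` of samples with the smallest variance, and the maximum distance product step greedily adds the candidate whose product of distances to the already chosen centers is largest. The function name `init_medoids` and its parameters are hypothetical.

```python
import math


def init_medoids(points, k, candidate_ratio=0.5):
    """Pick k initial medoid indices (illustrative sketch, not the paper's code)."""
    n = len(points)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of samples")

    # Pairwise Euclidean distances between all samples.
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Assumed "global variance": mean squared distance to the other samples.
    # A small value means the sample sits in a dense, central region.
    variance = [sum(d * d for d in row) / max(n - 1, 1) for row in dist]

    # Candidate set: the samples with the smallest variance (at least k of them).
    m = min(n, max(k, math.ceil(candidate_ratio * n)))
    candidates = sorted(range(n), key=lambda i: variance[i])[:m]

    # Start from the candidate with the smallest variance, then repeatedly add
    # the candidate whose product of distances to the chosen centers is largest.
    chosen = [candidates[0]]
    while len(chosen) < k:
        remaining = [c for c in candidates if c not in chosen]
        best = max(
            remaining,
            key=lambda c: math.prod(dist[c][j] for j in chosen),
        )
        chosen.append(best)
    return chosen


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(init_medoids(data, k=2, candidate_ratio=1.0))
```

Restricting the product step to low-variance candidates is what balances representativeness against dispersion: an outlier may be far from every chosen center, but it is unlikely to enter the candidate set in the first place. The density-guided medoid update mentioned in the abstract is not sketched here because the abstract does not describe it in enough detail.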

Keywords: K-medoids algorithm; initial cluster centers; variance optimization; maximum distance product method; sample density

Classification No.: TP301 [Automation and Computer Technology: Computer System Architecture]

 
