Optimization and Application of K-means Algorithm  Cited by: 3


Authors: FANG Shiqiao; HU Peiling; HUANG Yingying; ZHANG Xin [1] (College of Mathematics and Informatics, South China Agricultural University, Guangzhou 510642, China)

Affiliation: [1] College of Mathematics and Informatics, South China Agricultural University, Guangzhou, Guangdong 510642, China

Source: Modern Information Technology (《现代信息科技》), 2023, No. 6, pp. 111-115 (5 pages)

Abstract: The K-means algorithm is sensitive to initial values and outliers, its number of clusters is usually chosen by experience, and its initial cluster centers are selected at random. To address these shortcomings, a K-means clustering algorithm based on an improved Canopy algorithm is proposed. First, the initial data set is preprocessed and classified. The improved Canopy algorithm, run with specially selected thresholds, then yields the number of clusters and the initial cluster centers, and K-means is run from these to produce the final clustering. Tests show that the improved algorithm reduces the dependence on manual selection and significantly improves clustering accuracy. Finally, the improved algorithm is applied to a customer segmentation example and achieves good classification results, demonstrating the practicality of the optimized algorithm.

Keywords: Canopy algorithm; principal component analysis; local density; customer segmentation

Classification No.: TP301.6 [Automation and Computer Technology: Computer System Architecture]
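
The pipeline outlined in the abstract (a Canopy pass that supplies the number of clusters and the initial centers, followed by K-means) can be illustrated with a minimal sketch. This is not the authors' improved method: the abstract does not give its details, so the code below uses the classic Canopy procedure with two distance thresholds, and it picks the first remaining point as each canopy seed instead of a random one so that the run is reproducible. The function names `canopy_centers` and `kmeans`, the threshold names `t1` and `t2`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def canopy_centers(X, t1, t2):
    """Classic Canopy pass (assumed variant, not the paper's improved one).

    t1 is the loose threshold, t2 the tight threshold, with t1 > t2.
    Returns one center per canopy; the number of rows is the cluster count.
    """
    if t1 <= t2:
        raise ValueError("t1 must be greater than t2")
    remaining = list(range(len(X)))
    centers = []
    while remaining:
        seed = X[remaining[0]]  # deterministic seed choice (a simplification)
        dists = np.linalg.norm(X[remaining] - seed, axis=1)
        # Points within the loose threshold belong to this canopy.
        centers.append(X[remaining][dists < t1].mean(axis=0))
        # Points within the tight threshold can no longer seed a canopy.
        remaining = [i for i, d in zip(remaining, dists) if d >= t2]
    return np.array(centers)


def assign(X, centers):
    """Index of the nearest center for every point."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return d.argmin(axis=1)


def kmeans(X, init_centers, max_iter=100, tol=1e-6):
    """Plain Lloyd iterations started from the supplied centers."""
    centers = np.array(init_centers, dtype=float)
    for _ in range(max_iter):
        labels = assign(X, centers)
        new_centers = centers.copy()
        for k in range(len(centers)):
            members = X[labels == k]
            if len(members):  # keep the old center if a cluster is empty
                new_centers[k] = members.mean(axis=0)
        shift = np.linalg.norm(new_centers - centers)
        centers = new_centers
        if shift < tol:
            break
    return centers, assign(X, centers)


# Synthetic data: three well-separated groups (illustrative only).
rng = np.random.default_rng(0)
true_means = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([m + 0.3 * rng.standard_normal((50, 2)) for m in true_means])

init = canopy_centers(X, t1=6.0, t2=4.0)
centers, labels = kmeans(X, init)
print("clusters found by Canopy:", len(init))
```

The thresholds here are hand-picked for the synthetic data. The abstract states that the paper selects special thresholds, but it does not say how, so that selection rule is not reproduced in this sketch.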

 
