Authors: FANG Shiqiao (方诗乔); HU Peiling (胡佩玲); HUANG Yingying (黄莹莹); ZHANG Xin (张昕) [1]
Affiliation: [1] College of Mathematics and Informatics, South China Agricultural University, Guangzhou 510642, Guangdong, China
Source: Modern Information Technology (《现代信息科技》), 2023, No. 6, pp. 111-115 (5 pages)
Abstract: The K-means algorithm is sensitive to initial values and outliers, its number of clusters is usually chosen from manual experience, and its initial cluster centers are chosen at random. To address these shortcomings, a K-means clustering algorithm based on an improved Canopy algorithm is proposed. The initial data set is first preprocessed and classified. A special threshold is then selected, and the improved Canopy algorithm is used to obtain the number of clusters and the initial cluster centers. Finally, the K-means algorithm is run to produce the final clustering. Tests show that the improved algorithm reduces the dependence on manual selection and clearly improves clustering accuracy. The improved algorithm is then applied to a customer segmentation example, where it achieves good classification results, demonstrating the practicality of the optimized algorithm.
Keywords: Canopy algorithm; principal component analysis; local density; customer segmentation
Classification number: TP301.6 [Automation and Computer Technology: Computer System Architecture]
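Illustrative note: the pipeline summarized in the abstract (a Canopy pass that supplies the number of clusters and the initial centers, followed by K-means) can be sketched as below. This is a minimal sketch of the classic Canopy algorithm, not the authors' improved variant, whose preprocessing and threshold selection are not described in this record. The function names `canopy_centers` and `kmeans`, the thresholds `t1` and `t2`, and the toy data are all assumptions made for illustration.

```python
import math


def mean_point(pts):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    n = len(pts)
    return tuple(sum(coords) / n for coords in zip(*pts))


def canopy_centers(points, t1, t2):
    """Classic Canopy pass (hypothetical helper, not the paper's improved version).

    t1 is the loose threshold (canopy membership), t2 the tight threshold
    (points this close to a seed may not seed another canopy). Requires t1 > t2.
    Returns one center per canopy; the number of centers serves as k.
    """
    if t1 <= t2:
        raise ValueError("t1 must be larger than t2")
    remaining = list(points)
    centers = []
    while remaining:
        seed = remaining.pop(0)  # classic Canopy takes any remaining point
        # Every data point within t1 of the seed belongs to this canopy.
        members = [p for p in points if math.dist(seed, p) <= t1]
        centers.append(mean_point(members))
        # Drop candidates within t2 of the seed from the seed list.
        remaining = [p for p in remaining if math.dist(seed, p) > t2]
    return centers


def kmeans(points, centers, max_iter=100):
    """Lloyd's K-means started from the given centers instead of random ones."""
    centers = [tuple(c) for c in centers]

    def nearest(p):
        return min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))

    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for p in points:
            clusters[nearest(p)].append(p)
        # Keep the old center if a cluster ends up empty.
        updated = [mean_point(c) if c else centers[i] for i, c in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    labels = [nearest(p) for p in points]
    return centers, labels


# Toy data (assumed): two well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
seeds = canopy_centers(points, t1=4.0, t2=2.0)
centers, labels = kmeans(points, seeds)
print(len(seeds), labels)  # 2 [0, 0, 0, 0, 1, 1, 1, 1]
```

In this sketch the thresholds still have to be supplied by hand, which is exactly the kind of manual choice the abstract says the improved algorithm is meant to reduce.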