Authors: YIN Lifeng (殷丽凤)[1], LI Qingjie (栗庆杰) (School of Software, Dalian Jiaotong University, Dalian 116024, China)
Source: Journal of Dalian Jiaotong University (《大连交通大学学报》), 2024, No. 2, pp. 115-119 (5 pages)
Funding: National Natural Science Foundation of China (61771087).
Abstract: The heuristic k-means clustering algorithm examines nearby clusters after the first k-means iteration to predict, for each data point, the subset of clusters to which that point is likely to be assigned, which effectively speeds up the algorithm. However, because the heuristic algorithm selects its initial cluster centers at random and cannot effectively identify outliers in the data set, its clustering results have a large sum of squared errors and a small silhouette coefficient. To address this problem, the CHk-means algorithm is proposed. It introduces a careful seeding method to overcome the local-optimum problem caused by the random selection of initial cluster centers in heuristic k-means, and it introduces the local outlier factor (LOF) algorithm to detect outliers, reducing their influence on the clustering results. Comparative experiments with three algorithms on multiple data sets show that CHk-means effectively reduces the sum of squared errors, raises the silhouette coefficient, and markedly improves clustering quality.
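Keywords: clustering algorithm; k-means; heuristic algorithm; careful seeding; local outlier factor; outlier
Classification: TP311.13 [Automation and Computer Technology: Computer Software and Theory]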
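
Illustration: the abstract names careful seeding and the sum of squared errors but gives no formulas, so the following is only a minimal sketch, not the authors' CHk-means implementation. It assumes that "careful seeding" means the k-means++ style of D² sampling, in which each new center is drawn with probability proportional to its squared distance from the nearest center already chosen. The function names `careful_seeding` and `sum_of_squared_errors` are hypothetical, and the sketch further assumes the data set contains at least k distinct points.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def careful_seeding(points, k, rng):
    """Pick k initial centers by D^2 sampling (k-means++ style, assumed here).

    Assumes `points` holds at least k distinct points; otherwise every
    remaining weight becomes zero and rng.choices raises ValueError.
    """
    # The first center is chosen uniformly at random.
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Weight each point by its squared distance to the nearest chosen center,
        # so far-away points are more likely to become the next center.
        weights = [min(squared_distance(p, c) for c in centers) for p in points]
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


def sum_of_squared_errors(points, centers):
    """Sum over all points of the squared distance to the nearest center."""
    return sum(min(squared_distance(p, c) for c in centers) for p in points)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    centers = careful_seeding(points, 2, rng)
    print(centers, sum_of_squared_errors(points, centers))
```

Points already chosen as centers receive weight zero and cannot be drawn again, which is what spreads the initial centers across the data. The outlier-detection step (LOF) and the heuristic cluster-subset prediction described in the abstract are not covered by this sketch.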