Authors: Wang Sen (王森); Liu Chen (刘琛); Xing Shuaijie (邢帅杰) (School of Science, East China Jiaotong University, Nanchang 330013, China)
Source: Journal of East China Jiaotong University (《华东交通大学学报》), 2022, No. 5, pp. 119-126 (8 pages)
Funding: Natural Science Foundation of Jiangxi Province (2019ZACBL20010).
Abstract: Cluster analysis is an important data mining technique. In the 5G era, data sets are massive and high-dimensional. The K-means algorithm is susceptible to outliers, and the choice of the K value and of the initial cluster centers affects the stability and accuracy of the clustering result and can even trap the clustering in a local optimum, so improvements to K-means have attracted the attention of many researchers. This article summarizes the current state of research on K-means clustering. First, it introduces the principle of the K-means algorithm. Second, it groups existing improved algorithms by the problem they address (selection of the initial cluster centers, determination of the K value, and handling of outliers), classifies them as density-based or distance-based, and analyzes the advantages and drawbacks of each. Finally, it discusses possible future research directions and trends for the K-means algorithm.
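Keywords: K-means; clustering algorithm; K value; initial cluster centers; outliers; density; distance
Classification number: TP301.6 [Automation and Computer Technology: Computer System Architecture]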