Review on K-means Clustering Algorithm  Cited by: 55

Authors: Wang Sen; Liu Chen; Xing Shuaijie (School of Science, East China Jiaotong University, Nanchang 330013, China)

Affiliation: [1] School of Science, East China Jiaotong University, Nanchang 330013, Jiangxi, China

Source: Journal of East China Jiaotong University, 2022, No. 5, pp. 119-126 (8 pages)

Funding: Natural Science Foundation of Jiangxi Province (2019ZACBL20010).

Abstract: Cluster analysis is an important data mining technique. In the 5G era, data sets are massive and high-dimensional, and K-means is sensitive to outliers; the choice of K and of the initial cluster centers affects the stability and accuracy of the result and can trap the algorithm in a local optimum. Improving K-means has therefore attracted wide research attention. This paper summarizes the current state of research on K-means clustering. First, it introduces the principle of the K-means algorithm. Second, it classifies existing improved algorithms as density-based or distance-based according to the problem they address (selection of initial cluster centers, determination of K, and handling of outliers) and analyzes the strengths and weaknesses of each. Finally, it discusses possible future research directions and trends for K-means.
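The sensitivity to initial cluster centers mentioned in the abstract can be illustrated with a minimal sketch of the standard (Lloyd) K-means iteration. This is not code from the reviewed paper: the function names `kmeans` and `squared_distance`, the `max_iter` parameter, and the four-point toy data set are assumptions chosen for illustration only.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, initial_centers, max_iter=100):
    """Standard (Lloyd) K-means; K is implied by len(initial_centers).

    Returns the final centers and the sum of squared errors (SSE).
    """
    centers = [tuple(c) for c in initial_centers]
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: squared_distance(p, centers[i]))
            clusters[nearest].append(p)

        # Update step: move each center to the mean of its cluster.
        new_centers = []
        for center, members in zip(centers, clusters):
            if members:
                new_centers.append(tuple(
                    sum(m[d] for m in members) / len(members)
                    for d in range(len(center))))
            else:
                # Assumption: an empty cluster keeps its previous center.
                new_centers.append(center)

        if new_centers == centers:  # no center moved: converged
            break
        centers = new_centers

    sse = sum(min(squared_distance(p, c) for c in centers) for p in points)
    return centers, sse


# Toy data (hypothetical): two tight pairs of points, far apart along x.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]

# Initial centers taken from the two separate pairs.
centers_good, sse_good = kmeans(points, [(0, 0), (4, 0)])

# Initial centers placed between the pairs: the run converges immediately
# to a split by y-coordinate, which is a local optimum.
centers_bad, sse_bad = kmeans(points, [(2, 0), (2, 1)])

print(sse_good)  # 1.0
print(sse_bad)   # 16.0
```

With the same data and the same K, the two initializations converge to different partitions, and the second has a much larger SSE. This is the kind of instability that the improved initialization methods surveyed in the paper aim to reduce.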

Keywords: K-means; clustering algorithm; K value; initial cluster centers; outliers; density; distance

Classification code: TP301.6 [Automation and Computer Technology: Computer System Architecture]

 
