融合集群度与距离均衡优化的K-均值聚类算法被引量：3

K-means clustering algorithm based on cluster degree and distance equilibrium optimization

出　　处：《计算机应用》2018年第1期104-109,115,共7页journal of Computer Applications

基　　金：国家自然科学基金资助项目(61502262);山东省研究生教育创新计划项目(SDYY16023)~~

摘　　要：针对传统K-均值算法对初始聚类中心选择较为敏感的问题,提出了一种基于融合集群度与距离均衡优化选择的K-均值聚类(K-MCD)算法。首先,基于"集群度"思想选取初始簇中心;然后,遵循所有聚类中心距离总和均衡优化的选择策略,获得最终初始簇中心;最后,对文本集进行向量化处理,并根据优化算法重新选取文本簇中心及聚类效果评价标准进行文本聚类分析。对文本数据集从准确性与稳定性两方面进行仿真实验分析,与K-均值算法相比,K-MCD算法在4个文本集上的聚类精确度分别提高了18.6、17.5、24.3与24.6个百分点;在平均进化代数方差方面,K-MCD算法比K-均值算法降低了36.99个百分点。仿真结果表明K-MCD算法能有效提高文本聚类精确度,并具有较好的稳定性。To deal with the problem that the traditional K-means algorithm is sensitive to the initial clustering center selection, an algorithm of K-Means clustering based on Clustering degree and Distance equalization optimization （K-MCD） was proposed. Firstly, the initial clustering center was selected based on the idea of ＂cluster degree＂. Secondly, the selection strategy of total clustering center distance equilibrium optimization was followed to obtain the final initial clustering center. Finally, the text set was vectorized, and the text cluster center and the evaluation criteria of text clustering were rese｜ected to perform text clustering analysis according to the optimization algorithm. The analysis of simulation experiment for the text data set was carried out from the aspects of accuracy and stability. Compared with K-means algorithm, the clustering accuracy of K- MCD algorithm was improved by 18.6, 17.5, 24.3 and 24.6 percentage points respectively for four text sets; the average evolutionary algebraic variance of K-MCD algorithm was 36. 99 percentage points lower than K-means algorithm. The experimental results show that K-MCD algorithm can improve text clustering accuracy with good stability.

关键词：初始聚类中心 K-均值算法集群度距离均衡优化文本聚类

分类号：TP301.6[自动化与计算机技术—计算机系统结构] TP18[自动化与计算机技术—计算机科学与技术]

参考文献：

正在载入数据...

二级参考文献：

正在载入数据...

耦合文献：

正在载入数据...

引证文献：

正在载入数据...

二级引证文献：

正在载入数据...

同被引文献：

正在载入数据...

高级检索 检索式检索

时间限定

期刊范围

学科限定全选

高级检索 检索式检索

时间限定

期刊范围

学科限定全选

融合集群度与距离均衡优化的K-均值聚类算法被引量：3

我的收藏

参考文献：

二级参考文献：

耦合文献：

引证文献：

二级引证文献：

同被引文献：

相关期刊文献：

相关的主题

相关的作者对象

相关的机构对象

下载全文

高级检索检索式检索

时间限定

期刊范围

学科限定全选

高级检索 检索式检索

时间限定

期刊范围

学科限定全选

融合集群度与距离均衡优化的K-均值聚类算法 被引量：3

我的收藏

参考文献：

二级参考文献：

耦合文献：

引证文献：

二级引证文献：

同被引文献：

相关期刊文献：

相关的主题

相关的作者对象

相关的机构对象

下载全文

用户登录

高级检索检索式检索

融合集群度与距离均衡优化的K-均值聚类算法被引量：3