Data Clustering Mechanism Based on Improved GSA  Cited by: 4


Author: Zhang Xiaoqing (School of Mathematics and Computer Science, Wuhan Polytechnic University, Wuhan 430048, Hubei, China)

Affiliation: [1] School of Mathematics and Computer Science, Wuhan Polytechnic University, Wuhan 430048, Hubei

Source: Computer Applications and Software, 2021, No. 2, pp. 27-32, 84 (7 pages in total)

Funding: Natural Science Foundation of Hubei Province (2018CFB407); Wuhan Polytechnic University institutional research project (2019y07).

Abstract: Data clustering is a basic tool of big data analysis, but traditional clustering methods easily fall into local optima. To address this problem, a data clustering algorithm based on an improved gravitational search algorithm (GSA) is proposed. An encoding of clustering solutions suited to gravitational search evolution is defined. To measure the difference between clustering solutions, a Hamming-distance-based metric between gravitational search particles is designed, which captures how data objects differ in each attribute dimension. For the particle velocity update, an acceleration factor is introduced into the update strategy: the clustering solution represented by the best particle's position is used to speed up local exploitation and move particles toward the best particle, while maintaining the balance between local exploitation and global search. Experiments on classical data sets show that the algorithm achieves a lower clustering error rate than comparable algorithms on most of the test sets.
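The abstract does not give the encoding or the exact distance formula, so the following is only a minimal sketch of the general idea of a Hamming distance between two clustering solutions. It assumes, hypothetically, that each particle is a vector of cluster labels with one entry per data object; the function name `hamming_distance` and the sample particles are illustrative and are not taken from the paper.

```python
from typing import Sequence


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions at which two equal-length label vectors differ."""
    if len(a) != len(b):
        raise ValueError("particles must encode the same number of data objects")
    return sum(1 for x, y in zip(a, b) if x != y)


# Hypothetical particles (an assumption, not the paper's encoding):
# entry i is the cluster label assigned to data object i.
particle_a = [0, 0, 1, 1, 2, 2]
particle_b = [0, 1, 1, 1, 2, 0]

distance = hamming_distance(particle_a, particle_b)
print(distance)  # 2 (the vectors differ at positions 1 and 5)
```

Under this assumed label encoding, two particles that describe the same partition with permuted labels would still register a nonzero distance, so a practical version would have to fix a label convention first.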

Keywords: data clustering; gravitational search; Hamming distance; inter-cluster distance

Classification number: TP393.1 [Automation and Computer Technology: Computer Application Technology]

 
