Application of a MapReduce-Based Artificial Bee Colony Algorithm to Big Data  Cited by: 3

Application Research of A MapReduce-based Artificial Bee Colony for Large-scale Data Clustering

Authors: LI Guo (李果); YUAN Xiaokai (袁小凯); XU Aidong (许爱东); ZHANG Qiankun (张乾坤); ZHANG Fuzheng (张福铮) (Southern Power Grid Institute of Science, Guangzhou 510080)

Affiliation: [1] Southern Power Grid Institute of Science (南方电网科学研究院)

Source: Computer & Digital Engineering (《计算机与数字工程》), 2020, No. 1, pp. 124-129, 146 (7 pages)

Funding: Supported by the National Natural Science Foundation of China (Grant No. 61672393)

Abstract: As information technology advances, data volumes keep growing. Clustering is a typical data analysis method, and the clustering of large-scale data has attracted considerable attention in recent years. Existing sequential clustering algorithms incur high memory and computation-time costs when applied to large-scale data. To address this problem, the paper proposes a MapReduce-based artificial bee colony clustering algorithm. By introducing the MapReduce parallel programming paradigm, the algorithm computes the fitness of cluster centers quickly, which enables efficient clustering of large-scale data. Its clustering quality, scalability, and efficiency are evaluated on two datasets: a synthetic dataset and a real dataset obtained from a disk drive manufacturing process. Experimental results show that, compared with the existing PK-Means algorithm and the parallel K-PSO algorithm, the proposed algorithm achieves better clustering quality, stronger scalability, and higher clustering efficiency.
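
Illustrative sketch (not from the paper): the abstract states only that MapReduce is used to compute the fitness of cluster centers. The Python sketch below shows one way such a computation might be split into map and reduce steps. It assumes the fitness is the sum of Euclidean distances from each point to its nearest center, which the abstract does not specify; the function names map_phase, reduce_phase, and fitness, and the n_splits parameter, are hypothetical. It runs in a single process and only mimics the data flow of a MapReduce job.

```python
from collections import defaultdict
from math import dist


def map_phase(points, centers):
    """Map step: emit (nearest_center_index, distance) for each point."""
    for p in points:
        distances = [dist(p, c) for c in centers]
        nearest = min(range(len(centers)), key=distances.__getitem__)
        yield nearest, distances[nearest]


def reduce_phase(pairs):
    """Reduce step: sum the emitted distances per center index."""
    totals = defaultdict(float)
    for key, d in pairs:
        totals[key] += d
    return dict(totals)


def fitness(points, centers, n_splits=2):
    """Assumed fitness: total distance of all points to their nearest center.

    The data are cut into n_splits chunks to imitate input splits; a real
    MapReduce job would process the chunks on separate workers.
    """
    chunks = [points[i::n_splits] for i in range(n_splits)]
    pairs = [pair for chunk in chunks for pair in map_phase(chunk, centers)]
    return sum(reduce_phase(pairs).values())


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 12.0)]
    candidate = [(0.0, 0.0), (10.0, 10.0)]
    print(fitness(data, candidate))
```

In an artificial bee colony search, each food source would encode one candidate set of centers, and a lower value of this assumed fitness would indicate a tighter clustering.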

Keywords: big data; MapReduce; artificial bee colony; clustering; parallel programming paradigm

Classification code: TN911.34 [Electronics and Telecommunications: Communication and Information Systems]

 
