Research on Parallelization of Fuzzy K-means Algorithm Based on MapReduce (cited by: 1)

Authors: YANG Yanqing (杨延庆); YUAN Huabing (袁华兵) (Division of Information Technology, Xi'an Medical University, Xi'an 710021)

Source: Computer & Digital Engineering (《计算机与数字工程》), 2020, No. 7, pp. 1564-1567, 1765 (5 pages)

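Funding: Supported by the Shaanxi Province Youth Science Fund Project (No. 71701160) and the Xi'an Medical University Teaching Reform Research Project (No. 2018JG-07).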

Abstract: The fuzzy K-means algorithm is a soft clustering algorithm that quantitatively determines how closely objects are related. Because the algorithm falls short when analyzing and processing large-scale data, this paper proposes a parallel implementation based on the MapReduce programming model. First, the design of the Combine function is improved: before the output of the Map function is passed to the Reduce functions on other nodes, intermediate results are processed locally, which reduces communication overhead and speeds up the MapReduce job. Then, several datasets of different sizes are tested on the Hadoop distributed computing platform. The experiments show that the MapReduce-based parallel fuzzy K-means algorithm is suitable for analyzing and processing large-scale data, that its execution speed is increased by about 1.9 times, and that the clustering effect is more pronounced.

Keywords: fuzzy K-means; MapReduce model; Combine function; Hadoop platform

Classification: TP301 [Automation and Computer Technology: Computer System Architecture]
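
The local aggregation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give their Combine design, so the code below shows only the generic pattern, simulated in plain Python rather than through the Hadoop API. The fuzzifier value `m = 2`, the standard fuzzy K-means update rule, and all function names (`memberships`, `map_point`, `combine`, `reduce_centers`, `iterate`) are assumptions introduced for illustration.

```python
M = 2.0  # fuzzifier; assumed value, the abstract does not state one


def memberships(point, centers, m=M):
    """Fuzzy membership of one point in every cluster (standard update rule)."""
    dists = [sum((p - c) ** 2 for p, c in zip(point, center)) ** 0.5
             for center in centers]
    # A point lying exactly on a center belongs fully to that center.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exp = 2.0 / (m - 1.0)
    return [1.0 / sum((dj / dk) ** exp for dk in dists) for dj in dists]


def map_point(point, centers, m=M):
    """Map: emit (cluster id, (weighted point, weight)) for every cluster."""
    for j, u in enumerate(memberships(point, centers, m)):
        w = u ** m
        yield j, ([w * p for p in point], w)


def combine(pairs):
    """Combine: sum weighted points and weights per cluster id."""
    acc = {}
    for j, (vec, w) in pairs:
        if j not in acc:
            acc[j] = (list(vec), w)
        else:
            old_vec, old_w = acc[j]
            acc[j] = ([a + b for a, b in zip(old_vec, vec)], old_w + w)
    return list(acc.items())


def reduce_centers(pairs):
    """Reduce: merge partial sums from all splits and compute new centers."""
    merged = dict(combine(pairs))
    return [[x / merged[j][1] for x in merged[j][0]] for j in sorted(merged)]


def iterate(splits, centers, m=M):
    """One iteration: map and combine each split locally, then reduce."""
    shuffled = []
    for split in splits:
        mapped = [kv for point in split for kv in map_point(point, centers, m)]
        shuffled.extend(combine(mapped))  # at most k records leave each split
    return reduce_centers(shuffled)


if __name__ == "__main__":
    centers = [(0.0, 0.5), (10.0, 10.5)]
    splits = [[(0.0, 0.0), (0.0, 1.0)], [(10.0, 10.0), (10.0, 11.0)]]
    print(iterate(splits, centers))
```

Because the weighted sums are associative, combining within each split leaves the resulting centers unchanged while cutting the records sent to the reducer from one per point per cluster to one per cluster per split. That reduction is the kind of communication saving the abstract attributes to the Combine step.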
