Discrete Particle Swarm Optimization Algorithm for MapReduce Load Balance

Cited by: 1


Authors: LI Anying; CHEN Qun [1]; SONG He (School of Computer Science and Engineering, Northwestern Polytechnical University, Xi'an 710072, China)

Affiliation: [1] School of Computer Science, Northwestern Polytechnical University, Xi'an, Shaanxi 710072, China

Source: Process Automation Instrumentation (《自动化仪表》), 2018, No. 12, pp. 56-59 (4 pages)

Abstract: MapReduce is one of the core models of Hadoop and is widely used in big data processing. The MapReduce model divides computation into two stages, Map and Reduce. However, because of its partitioning mechanism, data skew (an unbalanced load across reducers) arises when data is processed in the Reduce phase. To address data skew, a discrete particle swarm optimization algorithm is proposed for balancing the data load in the Reduce phase. Combining the data partitioning strategy with the particle swarm algorithm improves the stability of the system. An objective function that drives the data partitions toward balance is defined, and the discrete particle swarm algorithm is used to solve it. Experimental results show that, for each number of Reduce tasks tested, the discrete particle swarm partitioning method has the shortest running time; it effectively resolves unbalanced data partitioning and greatly improves the computational efficiency of the system.
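The idea in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' formulation: the abstract does not state the objective function, so the sketch minimizes the load of the busiest reducer; the position update (copy each key's reducer from the global best, the personal best, or a random reducer with fixed probabilities) is one common way to discretize particle swarm optimization; and the names `reducer_loads`, `imbalance`, and `discrete_pso`, the parameter values, and the key sizes are all hypothetical. The baseline `k % num_reducers` stands in for a hash-modulo partitioner such as Hadoop's default `HashPartitioner`.

```python
import random


def reducer_loads(assignment, key_sizes, num_reducers):
    """Total record count landing on each reducer under a key -> reducer map."""
    loads = [0] * num_reducers
    for key_idx, reducer in enumerate(assignment):
        loads[reducer] += key_sizes[key_idx]
    return loads


def imbalance(assignment, key_sizes, num_reducers):
    """Assumed objective: load of the busiest reducer (lower is better)."""
    return max(reducer_loads(assignment, key_sizes, num_reducers))


def discrete_pso(key_sizes, num_reducers, num_particles=20, iterations=100,
                 p_global=0.4, p_personal=0.3, p_random=0.1, seed=0):
    """Search for a balanced key -> reducer assignment (illustrative only)."""
    rng = random.Random(seed)
    n = len(key_sizes)
    # Particle 0 starts from the modulo baseline, so the result is never worse.
    swarm = [[k % num_reducers for k in range(n)]]
    swarm += [[rng.randrange(num_reducers) for _ in range(n)]
              for _ in range(num_particles - 1)]
    pbest = [p[:] for p in swarm]
    pbest_cost = [imbalance(p, key_sizes, num_reducers) for p in swarm]
    g = min(range(num_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]

    for _ in range(iterations):
        for i, particle in enumerate(swarm):
            for d in range(n):
                r = rng.random()
                if r < p_global:
                    particle[d] = gbest[d]          # follow the swarm's best
                elif r < p_global + p_personal:
                    particle[d] = pbest[i][d]       # follow this particle's best
                elif r < p_global + p_personal + p_random:
                    particle[d] = rng.randrange(num_reducers)  # explore
                # otherwise the key keeps its current reducer
            cost = imbalance(particle, key_sizes, num_reducers)
            if cost < pbest_cost[i]:
                pbest[i], pbest_cost[i] = particle[:], cost
                if cost < gbest_cost:
                    gbest, gbest_cost = particle[:], cost
    return gbest, gbest_cost


if __name__ == "__main__":
    sizes = [90, 5, 5, 60, 10, 10, 40, 20, 20, 30]  # made-up records per key
    baseline = [k % 3 for k in range(len(sizes))]
    best, cost = discrete_pso(sizes, num_reducers=3)
    print("baseline loads:", reducer_loads(baseline, sizes, 3))
    print("PSO loads:     ", reducer_loads(best, sizes, 3))
```

In a real job the per-key sizes would have to be estimated first (for example by sampling Map output), and the resulting assignment would be applied through a custom partitioner; neither step is shown here.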

Keywords: distributed computing; discrete particle swarm optimization algorithm; data skew; data balance; partition

Classification: TH123.1 [Mechanical Engineering: Mechanical Design and Theory]; TP311 [Automation and Computer Technology: Computer Software and Theory]

 
