Improvement of job scheduling algorithm on Hadoop (改进的Hadoop作业调度算法)  Cited by: 5

Authors: FENG Xingjie[1], HE Yang[1]

Affiliation: [1] School of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300, China

Source: Computer Engineering and Applications (《计算机工程与应用》), 2017, No. 12, pp. 85-91 (7 pages)

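Funding: Joint Fund of the National Natural Science Foundation of China and the Civil Aviation Administration of China (No. U1233113); National Natural Science Foundation of China (No. 61301245; No. 61201414)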

Abstract: Load imbalance is a common problem in distributed clusters, and Hadoop does not take performance differences between nodes into account. Although Hadoop has a load balancing mechanism, its effect is not ideal, so load imbalance often occurs at run time. To address this problem, this paper analyzes the Hadoop source code in depth to clarify how Hadoop operates, and improves the ordering of Hadoop tasks within Yarn, Hadoop's resource management mechanism, by establishing new task ordering rules. It also proposes performance evaluation indices for each node, divided into dynamic performance indices and static performance indices. On this basis, the Fair Scheduler algorithm of Yarn is improved to form a scheduling algorithm that considers node performance. The Hadoop source code was recompiled, and comparative experiments were carried out on the resulting Hadoop platform. The experiments show that adding node performance indices effectively addresses the Hadoop load balancing problem and greatly improves Hadoop's running efficiency.
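The abstract describes rating each node on static and dynamic performance indices and letting the scheduler take that rating into account. The record does not give the paper's actual indices or formula, so the following is only a minimal sketch under stated assumptions: the `Node` fields, the equal sub-weights, the `alpha` blending factor, and the function names are all hypothetical and are not taken from the paper or from the Yarn API.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical node description; field choice is an assumption."""
    name: str
    cpu_cores: int      # static index: hardware capacity
    memory_gb: float    # static index: hardware capacity
    cpu_util: float     # dynamic index: current CPU utilization in [0, 1]
    mem_util: float     # dynamic index: current memory utilization in [0, 1]


def static_score(node, max_cores, max_mem):
    # Capacity relative to the strongest node in the cluster (assumed 50/50 weights).
    return 0.5 * node.cpu_cores / max_cores + 0.5 * node.memory_gb / max_mem


def dynamic_score(node):
    # Idle headroom right now (assumed 50/50 weights).
    return 0.5 * (1.0 - node.cpu_util) + 0.5 * (1.0 - node.mem_util)


def node_score(node, max_cores, max_mem, alpha=0.4):
    # alpha blends static capacity with current headroom; 0.4 is an arbitrary choice.
    return alpha * static_score(node, max_cores, max_mem) + (1.0 - alpha) * dynamic_score(node)


def rank_nodes(nodes, alpha=0.4):
    """Return nodes ordered from most to least preferred for the next task."""
    max_cores = max(n.cpu_cores for n in nodes)
    max_mem = max(n.memory_gb for n in nodes)
    return sorted(
        nodes,
        key=lambda n: node_score(n, max_cores, max_mem, alpha),
        reverse=True,
    )
```

In this sketch a scheduler would offer the next task to the first node returned by `rank_nodes`, so a lightly loaded, high-capacity node is preferred over a heavily loaded one of the same size.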

Keywords: big data; Hadoop; Yarn; load balancing; FairScheduler; algorithm

Classification No.: TP302.7 [Automation and Computer Technology: Computer System Architecture]

 
