Speculative task scheduling algorithm based on locality of data in Hadoop  Cited by: 9


Authors: Liu Kui (刘奎)[1], Liu Xiangdong (刘向东)[1], Ma Baolai (马宝来)[2], Wang Cuirong (王翠荣)[1]

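Affiliations: [1] School of Computer and Communication, Northeastern University at Qinhuangdao, Qinhuangdao, Hebei 066004, China; [2] School of Information and Engineering, Northeastern University, Shenyang 110000, China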

Source: Application Research of Computers (《计算机应用研究》), 2014, No. 1, pp. 182-187 (6 pages)

Funding: National Natural Science Foundation of China (61070162; 71071028)

Abstract: To address the limited optimization of existing task scheduling algorithms on the Hadoop platform, this paper proposes a speculative task scheduling algorithm based on data locality. The algorithm computes the ratio of Map to Reduce task durations on each node and combines it with the locality of data on different nodes, using a more accurate task-progress detection method than existing algorithms to distinguish fast nodes from slow ones. It then launches, on fast nodes, backup tasks for the lagging tasks with the longest remaining time, so that computation is moved instead of data. Experiments in a Hadoop environment show that, compared with existing scheduling algorithms, the proposed algorithm shortens the average task running time, reduces the network congestion caused by data exchange between cluster racks, and speeds up task execution.
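The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the formulas, so the progress-rate estimate of remaining time, the "below the mean rate" rule for slow nodes, and all names (`Task`, `progress_rate`, `remaining_time`, `choose_backup`) are assumptions made for illustration. The Map/Reduce duration ratio used in the paper is not modeled here.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Task:
    """Hypothetical view of a running task (not a Hadoop API type)."""
    task_id: str
    progress: float       # completed fraction, in [0, 1]
    elapsed: float        # seconds since the task started
    node: str             # node currently running the task
    replicas: frozenset   # nodes that hold a copy of the task's input data


def progress_rate(task):
    """Progress per second; 0.0 when nothing can be measured yet."""
    return task.progress / task.elapsed if task.elapsed > 0 else 0.0


def remaining_time(task):
    """Assumed estimate: remaining work divided by the observed rate."""
    rate = progress_rate(task)
    return float("inf") if rate == 0 else (1.0 - task.progress) / rate


def choose_backup(tasks, free_nodes):
    """Return (task_id, target_node) for one backup task, or None.

    Assumed policy: a node is slow if its mean task rate is below the
    cluster mean; the straggler is the unfinished task on a slow node
    with the longest estimated remaining time; the target is a fast
    node with a free slot, preferring one that already holds the data.
    """
    by_node = {}
    for t in tasks:
        by_node.setdefault(t.node, []).append(progress_rate(t))
    if not by_node:
        return None
    rates = {n: mean(r) for n, r in by_node.items()}
    cluster_mean = mean(rates.values())
    slow = {n for n, r in rates.items() if r < cluster_mean}

    stragglers = [t for t in tasks if t.node in slow and t.progress < 1.0]
    if not stragglers:
        return None
    straggler = max(stragglers, key=remaining_time)

    # Nodes with no measurements are treated as average, hence not slow.
    fast = [n for n in free_nodes if n not in slow]
    if not fast:
        return None
    local = [n for n in fast if n in straggler.replicas]
    target = max(local or fast, key=lambda n: rates.get(n, cluster_mean))
    return straggler.task_id, target
```

Preferring a fast node that already stores the input data is what makes the backup task a case of moving computation instead of data; the sketch falls back to any fast node only when no data-local one has a free slot.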

Keywords: Hadoop; task scheduling; heterogeneous environment; data locality

Classification: TP301.6 [Automation and Computer Technology: Computer System Architecture]

 
