Authors: Pan Jiayi (潘佳艺), Wang Fang (王芳)[1,2,3], Yang Jingyi (杨静怡)[1,2,3], Tan Zhipeng (谭支鹏)[1,2,3]
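Affiliations: [1] Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, Hubei 430074; [2] School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei 430074; [3] Key Laboratory of Information Storage System, Ministry of Education, Huazhong University of Science and Technology, Wuhan, Hubei 430074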
Source: Computer Engineering & Science (《计算机工程与科学》), 2017, No. 3, pp. 413-423 (11 pages)
Funding: National 863 Program of China (2013AA013203)
Abstract: With the continued development and practical adoption of Hadoop-based big data technology, the Hadoop YARN (Yet Another Resource Negotiator) resource scheduling policy has become increasingly ill-suited to heterogeneous clusters. On the one hand, YARN cannot allocate node resources dynamically, which wastes the computing resources of the more capable nodes and leaves system performance underused. On the other hand, YARN's existing static resource allocation policy ignores the differences between a job's execution stages, which tends to produce a large number of resource fragments. To address these problems, we propose a load-adaptive feedback scheduling strategy. The system monitors performance information for the cluster's execution nodes and for submitted jobs, and uses the real-time monitoring data to model and quantify each node's overall computing capability. Combining the node and job performance information, the scheduler then launches a dynamic resource scheduling scheme based on similarity assessment. The optimized system can effectively identify differences in execution capability among cluster nodes and perform fine-grained dynamic resource scheduling according to the real-time needs of job tasks. It refines YARN's existing scheduling semantics and can also be deployed as a secondary resource scheduling scheme beneath an upper-level scheduler. We implement and test the strategy on Hadoop 2.0. The experimental results show that the adaptive resource scheduling strategy significantly improves resource utilization, increases cluster concurrency by 2 to 3 times, and improves time performance by nearly 10%.
Classification: TP391 [Automation and Computer Technology - Computer Application Technology]