Online task scheduling algorithm for big data analytics based on cumulative running work  (cited by: 6)

Original title: 基于累计工作量的在线大数据分析作业调度算法

Authors: LI Yefei; XU Chao; XU Daoqiang; ZOU Yunfeng; ZHANG Xiaoda; QIAN Zhuzhong [1]

Affiliations: [1] Department of Computer Science and Technology, Nanjing University, Nanjing, Jiangsu 210023, China; [2] Electric Power Research Institute, State Grid Jiangsu Electric Power Company Limited, Nanjing, Jiangsu 210000, China; [3] State Grid Jiangsu Electric Power Company Limited, Nanjing, Jiangsu 210000, China

Source: Journal of Computer Applications (《计算机应用》), 2019, No. 8, pp. 2431-2437 (7 pages)

Funding: National Natural Science Foundation of China (61472181); Natural Science Foundation of Jiangsu Province (BK20151392)

Abstract: To execute jobs efficiently without prior knowledge in big data analytics systems such as Hadoop and Spark, a scheduler based on Cumulative Running Work (CRW), named CRWScheduler, is proposed. The scheduler moves a running job between a low-weight queue and a high-weight queue according to its CRW. When allocating resources to a job, it considers both the queue the job is in and the job's instantaneous resource usage, which significantly improves overall system performance without requiring prior knowledge of the job. A prototype of CRWScheduler was implemented on Apache Hadoop YARN. Experiments on a 28-node benchmark cluster show that, compared with the YARN fair scheduler, CRWScheduler reduces the average Job Flow Time (JFT) by 21% and the 95th-percentile JFT by up to 35%. Further improvements can be obtained when CRWScheduler cooperates with task-level schedulers.
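The two mechanisms named in the abstract (queue switching driven by CRW, and allocation that looks at both queue and instantaneous usage) can be illustrated with a minimal sketch. This is not the paper's implementation: the threshold, the queue weights, the low-to-high switching direction, and the weighted fair-share rule in `pick_next` are all assumptions chosen for illustration, and the names `Job`, `account`, and `pick_next` are hypothetical.

```python
from dataclasses import dataclass
from typing import List

# All constants are illustrative assumptions, not values from the paper.
CRW_THRESHOLD = 100.0                       # hypothetical CRW level that triggers a queue switch
QUEUE_WEIGHTS = {"low": 1.0, "high": 2.0}   # hypothetical queue weights


@dataclass
class Job:
    name: str
    crw: float = 0.0      # cumulative running work (resources x time) so far
    in_use: float = 0.0   # resources the job holds right now
    queue: str = "low"    # assumed: every job starts in the low-weight queue


def account(job: Job, elapsed: float) -> None:
    """Add the work done over `elapsed` time units; switch queues if the threshold is crossed."""
    job.crw += job.in_use * elapsed
    if job.queue == "low" and job.crw >= CRW_THRESHOLD:
        job.queue = "high"


def pick_next(jobs: List[Job]) -> Job:
    """Choose the job that receives the next free resource unit.

    Assumed rule: the job with the smallest instantaneous usage divided by
    its queue weight wins (a weighted fair-share comparison).
    """
    return min(jobs, key=lambda j: j.in_use / QUEUE_WEIGHTS[j.queue])


a = Job("a", in_use=10.0)
b = Job("b", in_use=10.0)
account(a, 12.0)   # a accumulates 120 units of work and crosses the threshold
account(b, 5.0)    # b accumulates 50 units and stays in the low-weight queue
chosen = pick_next([a, b])
```

Under these assumptions, neither function needs a job's total size or remaining work; both rely only on quantities observable at run time, which is the sense in which the scheduler works without prior knowledge.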

Keywords: data analytics system; job flow time; fairness; starvation avoidance

Classification number: TP316.4 [Automation and Computer Technology: Computer Software and Theory]

 
