Flow-network based auto rescale strategy for Flink (基于流网络的Flink平台弹性资源调度策略)  Cited by: 15

Authors: LI Ziyang; YU Jiong[1,2]; BIAN Chen; ZHANG Yitian; PU Yonglin; WANG Yuefei[1]; LU Liang

Affiliations: [1] School of Information Science and Engineering, Xinjiang University, Urumqi 830046, China; [2] School of Software, Xinjiang University, Urumqi 830008, China; [3] College of Internet Finance and Information Engineering, Guangdong University of Finance, Guangzhou 510521, China; [4] School of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300, China

Source: Journal on Communications (《通信学报》), 2019, No. 8, pp. 85-101 (17 pages)

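Funding: National Natural Science Foundation of China (No. 61862060, No. 61462079, No. 61562086, No. 61562078); Science and Technology Support Program of the Ministry of Science and Technology of China (No. 2015BAH02F01); Natural Science Foundation of Xinjiang Uygur Autonomous Region (No. 2017D01A20); Scientific Research Program of Higher Education Institutions of Xinjiang Uygur Autonomous Region (No. XJEDU2016S106)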

Abstract: The computational load on big data stream computing platforms rises with fluctuation, yet clusters cannot respond effectively to these load changes. To address this problem, a flow-network based auto rescale strategy for Flink (FAR-Flink) is proposed. First, a flow-network model is built, and the capacity of each edge is computed by a self-learning construction algorithm. Second, an elastic resource scheduling algorithm based on maximum flow locates the performance bottleneck of the cluster and draws up a dynamic resource rescheduling plan. Finally, a state data migration algorithm, based on managing state data in clusters and buckets, executes the plan and migrates data efficiently between nodes. Experimental results show that the strategy performs well in applications with complex state data: it improves cluster throughput and shortens state data migration time while satisfying the latency constraint of the application. FAR-Flink therefore effectively improves the cluster's ability to respond to load fluctuation.

Keywords: stream computing; resource scheduling; elastic cluster; load migration; Flink

CLC number: TP311 [Automation and Computer Technology: Computer Software and Theory]
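
The abstract describes modelling a streaming job as a flow network and using maximum flow to locate the cluster's bottleneck. The sketch below illustrates that general idea only. It is a textbook Edmonds-Karp maximum-flow routine followed by a minimum-cut readout, not the FAR-Flink algorithm itself. The operator names and edge capacities are hypothetical, and the source and sink are assumed to be distinct nodes.

```python
from collections import deque


def max_flow_min_cut(capacity, source, sink):
    """Generic Edmonds-Karp; returns (max flow value, saturated min-cut edges).

    capacity maps a directed edge (u, v) to the rate it can sustain.
    Assumes source != sink.
    """
    residual = {}
    neighbours = {}
    for (u, v), cap in capacity.items():
        residual[(u, v)] = residual.get((u, v), 0) + cap
        residual.setdefault((v, u), 0)  # reverse edge for undoing flow
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)

    def bfs():
        # Shortest augmenting path search; returns the parent map of visited nodes.
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in neighbours.get(u, ()):
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    if v == sink:
                        return parent
                    queue.append(v)
        return parent

    total = 0
    while True:
        parent = bfs()
        if sink not in parent:
            break  # no augmenting path left: the flow is maximal
        path = []
        v = sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(residual[edge] for edge in path)
        for (u, v) in path:
            residual[(u, v)] -= push
            residual[(v, u)] += push
        total += push

    # Nodes still reachable from the source form one side of a minimum cut;
    # original edges leaving that side are the saturated bottleneck edges.
    reachable = set(parent)
    cut = sorted((u, v) for (u, v) in capacity
                 if u in reachable and v not in reachable)
    return total, cut


# Hypothetical topology; capacities in thousands of records per second.
topology = {
    ("source", "map_a"): 50,
    ("source", "map_b"): 50,
    ("map_a", "window"): 30,
    ("map_b", "window"): 40,
    ("window", "sink"): 100,
}
flow, cut = max_flow_min_cut(topology, "source", "sink")
print(flow, cut)
```

In this toy topology the two edges into `window` limit the throughput, so a rescaling plan would target the operators on those edges. How FAR-Flink actually learns edge capacities and derives its rescheduling plan is specified in the paper, not here.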

 
