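Research on an Elastic Resource Allocation Strategy for Distributed Stream Processing Based on Event-Driven Architecture  Cited by: 3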

Elastic Resource Allocation of Distributed Streaming Process Based on Event Driven Architecture


Authors: TANG Xiao-Chun (汤小春)[1], ZHANG Ke (张克), ZHAO Quan (赵全), LI Zhan-Huai (李战怀)[1] (School of Computer Science, Northwestern Polytechnical University, Xi'an 710129)

Affiliation: [1] School of Computer Science, Northwestern Polytechnical University, Xi'an 710129

Source: Chinese Journal of Computers (《计算机学报》), 2023, No. 2, pp. 244-259 (16 pages)

Funding: Supported by the National Key Research and Development Program of China (2018YFB1003400).

Abstract: For stream processing applications with multiple data sources and multiple outputs, development on a single distributed data stream engine falls short in both architecture and scalability, and distributed stream processing based on event-driven architecture is the main way to address this problem. However, when event-driven architecture is applied to stream processing, the data injection rate and the data processing rate often diverge: when the number of streaming data sources changes or the distribution of data values fluctuates, processing latency grows or resources are underutilized. For this mismatch between data injection and data processing, existing elastic resource allocation strategies struggle to handle the dependencies between producers and consumers, and their allocation results are poor. This paper proposes a reinforcement-learning-based elastic resource allocation method that addresses the latency or resource underutilization caused by data fluctuations between stream processing applications with dependencies. By building a state matrix and a command matrix, the method lets the resource manager perceive state changes in upstream and downstream applications and adjust the resource demands of stream processing applications in time, which meets the latency requirements during execution and improves the system's resource utilization. In tests against Spark's dynamic resource allocation, the reinforcement-learning-based method reduces latency by 15%, raises resource utilization by more than 20%, and increases throughput by about 10%.

With the development of cloud computing and big data, many data stream engines have appeared, such as Spark and Flink. When a single distributed data stream engine is used to develop a distributed stream processing application with multiple data sources and multiple query targets, there are shortcomings not only in architecture but also in scalability, which makes it harder for developers to build large-scale applications. Therefore, distributed stream processing based on event-driven architecture has been adopted by more and more developers. It performs well in building both simple and complex applications because of its low coupling and high scalability. An application based on event-driven architecture consists of a series of single-purpose components responsible for asynchronously receiving and processing events. However, event-driven architecture faces a contradiction between the data injection rate and the data processing rate when the number of streaming data sources fluctuates and the distribution of data values changes, resulting in processing delays or insufficient resource utilization. Elastic resource allocation can solve the problems caused by data fluctuations by dynamically adjusting the resources allocated to the application based on changes in load pressure in the stream processing system, but traditional strategies (e.g. the back pressure mechanism or simple machine learning methods) have shortcomings in event-driven architecture. It is difficult for them to handle the dependencies between production and consumption effectively, which shows up as an inability to detect fluctuations in the data generated by the producers in time to adjust the resources of consumers. In addition, it is difficult to grasp the timing of resource allocation, so resources cannot be provided in time and the delay of stream processing increases. This paper conducts an in-depth study and comparison of the existing elastic resource allocation strategies and analyzes the limitations of some existing strate

Keywords: event-driven; distributed stream processing; elastic resources; reinforcement learning; data injection

Classification: TP301 [Automation and Computer Technology: Computer System Architecture]
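Illustration: The abstract describes a resource manager that observes upstream and downstream state and uses reinforcement learning to adjust the resources of a consumer. The record does not give the paper's algorithm, so the following is only a minimal sketch of the general idea using tabular Q-learning on a toy producer/consumer queue. It is not the authors' method. The state tuple loosely stands in for a row of the "state matrix" and the action set for the "command matrix"; every function name, bucket size, rate, and reward weight below is a hypothetical choice made for illustration.

```python
import random
from collections import defaultdict

# Hypothetical scaling commands: remove one executor, hold, add one executor.
ACTIONS = (-1, 0, 1)


def bucket(value, step, max_bucket):
    """Discretise a non-negative measurement into a small integer bucket."""
    return min(int(value // step), max_bucket)


def observe(producer_rate, backlog, executors):
    """State = (upstream rate bucket, consumer backlog bucket, executor count)."""
    return (bucket(producer_rate, 50, 4), bucket(backlog, 100, 5), executors)


def reward(backlog, executors):
    """Penalise queued events (a latency proxy) and allocated executors (cost)."""
    return -(backlog / 100.0) - 0.5 * executors


def q_update(q, state, action, r, next_state, alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update on a dict-based Q-table."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (r + gamma * best_next - q[(state, action)])


def choose(q, state, epsilon, rng):
    """Epsilon-greedy selection: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def train(steps=20000, per_executor_rate=40, max_executors=8, seed=0):
    """Train on a toy workload whose producer rate alternates between two phases."""
    rng = random.Random(seed)
    q = defaultdict(float)
    backlog, executors = 0.0, 1
    for t in range(steps):
        # Toy producer (assumption): quiet phase at 60 events/step, busy at 200.
        producer_rate = 60 if (t // 200) % 2 == 0 else 200
        state = observe(producer_rate, backlog, executors)
        action = choose(q, state, epsilon=0.1, rng=rng)
        executors = min(max(executors + action, 1), max_executors)
        # Toy consumer (assumption): each executor drains a fixed number of events.
        drained = executors * per_executor_rate
        backlog = min(max(0.0, backlog + producer_rate - drained), 1000.0)
        next_state = observe(producer_rate, backlog, executors)
        q_update(q, state, action, reward(backlog, executors), next_state)
    return q
```

After `q = train()`, calling `choose(q, observe(rate, backlog, executors), 0.0, random.Random())` returns the greedy scaling command for the observed state. A real deployment would replace the toy producer and consumer with measured rates and backlog from the upstream and downstream applications.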

 
