Parameter and Resource Coordination Adjustment Strategy for Spark Streaming  Cited by: 2

Authors: LIANG Yi[1], LIU Fei, CHANG Shi-lu, CHENG Shi-fan (Computer School, Beijing University of Technology, Beijing 100124, China)

Affiliation: [1] Computer School, Beijing University of Technology, Beijing 100124, China

Source: Software Guide (《软件导刊》), 2019, No. 1, pp. 45-47, 55 (4 pages)

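Funding: National Natural Science Foundation of China (91546111; 91646201); National Key Research and Development Program of China (2017YFC0803300); Beijing Municipal Education Commission project (KZ201610005009)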

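Abstract (translated from the Chinese): Spark Streaming is a typical batched stream processing platform for handling continuously arriving data streams. The two most important characteristics of streaming data are volatility and timeliness. Dynamically adjusting system parameters and dynamically adjusting resources can both keep response latency within bounds under different data arrival rates, but parameter adjustment has a limited scope of application, and resource adjustment imposes a high cost on users. This paper therefore proposes a coordinated parameter and resource adjustment strategy that uses a dynamic neighborhood particle swarm algorithm to find the system configuration that meets the SLO target with the fewest resources. Experiments show that AdaStreaming reduces latency by 70.1% compared with DyBBS and uses 42.1% fewer resources than DRA.

Abstract (English, as published): Spark Streaming is a typical batched stream processing system that can be used to process continuously arriving data streams. The two most important characteristics of streaming data are volatility and timeliness. Dynamic parameter configuration and dynamic resource allocation have been proposed to guarantee end-to-end latency under different data arrival rates. However, dynamic parameter configuration has a limited scope of application, and dynamic resource allocation brings greater cost to users. Therefore, this paper proposes a parameter and resource coordination adjustment strategy that uses a dynamic neighborhood particle swarm algorithm to find a solution that minimizes resources while meeting the SLO goal. Experiments show that AdaStreaming reduced latency by 59% against DyBBS and reduced the amount of resources by 34% against DRA.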

Keywords: Spark Streaming; dynamic neighborhood particle swarm; parameter configuration; resource allocation

Classification: TP301 [Automation and Computer Technology - Computer System Architecture]
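
Illustrative sketch: the abstract describes searching for the configuration that meets a latency SLO with the fewest resources by means of a dynamic neighborhood particle swarm. The Python below is a minimal sketch of that idea, not the authors' implementation. Everything in it is an assumption made for illustration: the toy latency model, the constants, the search bounds, the penalty scheme, and the choice to rebuild each particle's neighborhood every iteration from its k nearest particles.

```python
import math
import random

# All constants are hypothetical and chosen only to make the sketch runnable.
SLO_SECONDS = 4.0        # latency target
ARRIVAL_RATE = 20000.0   # records per second
COST_PER_RECORD = 2e-4   # executor-seconds needed per record
OVERHEAD = 0.2           # fixed per-batch overhead in seconds
BOUNDS = [(0.5, 10.0), (1.0, 20.0)]  # (batch interval in s, executor count)


def latency(interval, executors):
    """Toy end-to-end latency model; not measured from Spark."""
    processing = ARRIVAL_RATE * interval * COST_PER_RECORD / executors + OVERHEAD
    if processing > interval:  # batches pile up, so the system is unstable
        return math.inf
    return interval + processing


def cost(position):
    """Executor count, with a large penalty when the SLO is violated."""
    interval, executors = position[0], max(1, round(position[1]))
    lat = latency(interval, executors)
    if lat > SLO_SECONDS:
        return 1000.0 + min(lat, 100.0)
    return executors + 0.01 * lat  # small tie-breaker favours lower latency


def dn_pso(n_particles=20, iterations=60, k=5, seed=0):
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for lo, hi in BOUNDS] for _ in range(n_particles)]
    pos[0] = [1.0, 20.0]  # over-provisioned start that is feasible under the toy model
    vel = [[0.0, 0.0] for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [cost(p) for p in pos]

    def dist(a, b):
        # squared distance with each dimension scaled to its search range
        return sum(((x - y) / (hi - lo)) ** 2 for x, y, (lo, hi) in zip(a, b, BOUNDS))

    for _ in range(iterations):
        for i in range(n_particles):
            # Neighbourhood is rebuilt every iteration from the k nearest
            # particles (the particle itself is normally among them).
            neigh = sorted(range(n_particles), key=lambda j: dist(pos[i], pos[j]))[:k]
            lbest = pbest[min(neigh, key=lambda j: pbest_cost[j])]
            for d, (lo, hi) in enumerate(BOUNDS):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * r1 * (pbest[i][d] - pos[i][d])
                             + 1.5 * r2 * (lbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c

    best = min(range(n_particles), key=lambda j: pbest_cost[j])
    return pbest[best][0], max(1, round(pbest[best][1]))
```

Calling `dn_pso()` returns a `(batch_interval, executors)` pair that satisfies the toy SLO. In a real deployment the `latency` function would have to be replaced by measurements or a fitted performance model, which this sketch does not attempt.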

 
