Authors: WANG Xu-liang (王绪亮); NIE Tie-zheng (聂铁铮); TANG Xin-ran (唐欣然); HUANG Ju (黄菊); LI Di (李迪); YAN Ming-sen (闫铭森); LIU Chang (刘畅) (School of Computer Science and Engineering, Northeastern University, Shenyang 110169, China; College of Software, Northeastern University, Shenyang 110169, China)
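Affiliations: [1] School of Computer Science and Engineering, Northeastern University, Shenyang 110169, China; [2] College of Software, Northeastern University, Shenyang 110169, China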
Source: Computer Science (《计算机科学》), 2020, No. 11, pp. 122-127 (6 pages)
Abstract: Streaming data processing is widely used in modern big data applications. Message middleware or message queues usually serve as the data buffer in streaming data processing. Apache Kafka is often used as this buffering middleware, and its performance largely determines the overall performance of the application system. In practice, the data traffic generated by Kafka's upstream data sources is usually unstable, and a static caching strategy cannot adapt to such a variable production environment. To address this problem, a strategy that dynamically adjusts the data cache according to changes in upstream traffic can improve the system's adaptability to its environment and improve both the real-time behavior and the throughput of streaming data caching. The dynamic caching strategy monitors upstream data traffic and uses an ARIMA model to predict future traffic, so that the store-and-forward settings for streaming data can be adjusted in advance. The optimal values of the cache setting parameters are obtained by multi-objective optimization over experimental measurements of middleware performance under various loads. Comparative experiments show that, during peak arrival periods of streaming data, the strategy improves the data buffering throughput of Apache Kafka by more than 150% while guaranteeing a certain maximum delay, thereby improving overall system performance.
Keywords: Apache Kafka platform; time series prediction; multi-objective optimization; streaming data processing; message middleware
Classification number: TP311 [Automation and Computer Technology: Computer Software and Theory]
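
Illustrative sketch (not from the paper): the abstract describes forecasting upstream traffic with an ARIMA model and adjusting buffering settings before the load arrives. The Python below shows one way that loop might look, under loud assumptions. It hand-fits the simplest ARIMA variant, ARIMA(1,1,0) without a constant, by least squares so that it needs no external library; the paper's actual model order and fitting method are not given in this record. `batch.size` and `linger.ms` are real Kafka producer settings, but the record does not say which parameters the authors tune, and the thresholds and values in `SETTINGS_BY_LOAD` are hypothetical stand-ins for the results of their multi-objective optimization.

```python
from typing import Dict, List, Tuple


def fit_ar1_on_differences(series: List[float]) -> float:
    """Estimate phi in d_t = phi * d_(t-1) + e_t, where d_t = y_t - y_(t-1).

    This is ARIMA(1,1,0) without a constant, fitted by least squares.
    """
    if len(series) < 3:
        raise ValueError("need at least 3 observations")
    diffs = [b - a for a, b in zip(series, series[1:])]
    num = sum(prev * cur for prev, cur in zip(diffs, diffs[1:]))
    den = sum(prev * prev for prev in diffs[:-1])
    return num / den if den else 0.0


def forecast(series: List[float], steps: int) -> List[float]:
    """Forecast the next `steps` traffic values (e.g. messages per second)."""
    phi = fit_ar1_on_differences(series)
    level = series[-1]
    diff = series[-1] - series[-2]
    predictions = []
    for _ in range(steps):
        diff = phi * diff       # predicted next difference
        level = level + diff    # integrate back to the original scale
        predictions.append(level)
    return predictions


# Hypothetical lookup table: (upper bound on messages/s, producer settings).
# In the paper these values would come from multi-objective optimization.
SETTINGS_BY_LOAD: List[Tuple[float, Dict[str, int]]] = [
    (1_000.0, {"batch.size": 16_384, "linger.ms": 0}),
    (10_000.0, {"batch.size": 65_536, "linger.ms": 5}),
    (float("inf"), {"batch.size": 262_144, "linger.ms": 20}),
]


def choose_settings(predicted_rate: float) -> Dict[str, int]:
    """Pick the settings tier whose upper bound covers the predicted rate."""
    for upper_bound, settings in SETTINGS_BY_LOAD:
        if predicted_rate <= upper_bound:
            return settings
    return SETTINGS_BY_LOAD[-1][1]


if __name__ == "__main__":
    observed = [100.0, 200.0, 300.0, 400.0, 500.0]  # made-up traffic samples
    upcoming = forecast(observed, steps=2)
    print(upcoming, choose_settings(max(upcoming)))
```

Choosing settings from the peak of the forecast horizon, rather than from its mean, is a conservative design choice made here for illustration; a real deployment would also have to decide how often to refit the model and how to apply the new settings to running producers.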