Adaptive watermark generation mechanism based on time series prediction for stream processing  Cited by: 1


Authors: Yang SONG, Yunchun LI, Hailong YANG, Jun XU, Zerong LUAN, Wei LI

Affiliations: [1] Sino-German Joint Software Institute, Beihang University, Beijing 100191, China [2] School of Computer Science and Engineering, Beihang University, Beijing 100191, China [3] Science and Technology on Space System Simulation Laboratory, Beijing Simulation Center, Beijing 100854, China [4] College of Life Sciences and Bioengineering, Beijing University of Technology, Beijing 100083, China

Source: Frontiers of Computer Science (中国计算机科学前沿(英文版)), 2021, Issue 6, pp. 59-73 (15 pages)

Funding: This work was supported by the National Key Research and Development Program of China (2020YFB1506703) and the National Natural Science Foundation of China (Grant No. 62072018).

Abstract: Data stream processing frameworks process stream data based on event time to ensure that requests can be answered in real time. In practice, streaming data usually arrives out of order due to factors such as network delay. Stream processing frameworks commonly adopt the watermark mechanism to handle this disorder. A watermark is a special record carrying a timestamp that is inserted into the data stream; it helps the framework decide whether received data is late and should therefore be discarded. Traditional watermark generation strategies are periodic; they cannot dynamically adjust the watermark distribution to balance responsiveness and accuracy. This paper proposes an adaptive watermark generation mechanism based on a time series prediction model to address this limitation. The mechanism dynamically adjusts the frequency and timing of watermark distribution using the disordered data ratio and other lateness properties of the data stream, improving system responsiveness while ensuring acceptable result accuracy. We implement the proposed mechanism on top of Flink and evaluate it with real-world datasets. The experimental results show that our mechanism is superior to existing watermark distribution strategies in terms of both system responsiveness and result accuracy.

Keywords: data stream processing; watermark; time series based prediction; dynamic adjustment

Classification number: TP309.7 [Automation and Computer Technology: Computer System Architecture]
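
The watermark idea described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's mechanism and does not use Flink's API: every class and function name here is hypothetical, exponential smoothing stands in for the paper's time series prediction model, and only the lateness bound is adapted (the paper also adapts the frequency and timing of emission). An event is treated as late when its timestamp is less than or equal to the last emitted watermark.

```python
from typing import Iterable, List

NEG_INF = float("-inf")


class FixedDelayWatermark:
    """Baseline (hypothetical): watermark = max event time seen - constant delay."""

    def __init__(self, delay: float) -> None:
        self.delay = delay
        self.max_ts = NEG_INF

    def observe(self, ts: float) -> None:
        self.max_ts = max(self.max_ts, ts)

    def watermark(self) -> float:
        return self.max_ts - self.delay


class SmoothedLatenessWatermark:
    """Assumed stand-in for a prediction model: exponential smoothing of lateness."""

    def __init__(self, alpha: float = 0.5, margin: float = 2.0) -> None:
        self.alpha = alpha      # smoothing factor (assumption)
        self.margin = margin    # safety multiplier on the prediction (assumption)
        self.max_ts = NEG_INF
        self.predicted = 0.0    # predicted lateness of upcoming events

    def observe(self, ts: float) -> None:
        # Lateness = how far this event trails the newest event time seen so far.
        lateness = max(0.0, self.max_ts - ts)
        self.max_ts = max(self.max_ts, ts)
        self.predicted = self.alpha * lateness + (1 - self.alpha) * self.predicted

    def watermark(self) -> float:
        return self.max_ts - self.margin * self.predicted


def dropped_events(event_times: Iterable[float], generator, emit_every: int) -> List[float]:
    """Replay events in arrival order; return those discarded as late."""
    current = NEG_INF
    dropped = []
    for i, ts in enumerate(event_times, start=1):
        if ts <= current:
            dropped.append(ts)
        generator.observe(ts)  # late events still inform the lateness estimate
        if i % emit_every == 0:
            # Watermarks must never move backwards.
            current = max(current, generator.watermark())
    return dropped


# Arrival order: every second event trails the newest timestamp by 2.
arrivals = [5, 3, 6, 4, 7, 5, 8, 6]
print(dropped_events(arrivals, FixedDelayWatermark(delay=0), emit_every=2))
print(dropped_events(arrivals, SmoothedLatenessWatermark(), emit_every=2))
```

On this toy input the zero-delay baseline discards the trailing events, while the smoothed generator holds the watermark back just far enough to keep them. A larger fixed delay would also keep them, at the cost of always waiting that long, which is the responsiveness versus accuracy trade-off the abstract refers to.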

 
