A Study of Load Shedding Strategies for Sliding-Window Join Queries over Data Streams

A Relatively Effective and Practical Load Shedding Strategy for Sliding-Window Join Queries over Data Streams


Authors: Zhang Longbo (张龙波)[1], Li Zhanhuai (李战怀)[1], Zhu Liping (朱立平)[1], Liu Jiangtao (刘江涛)[1], Zhao Yiqiang (赵以强)[2]

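Affiliations: [1] School of Computer Science, Northwestern Polytechnical University, Xi'an, Shaanxi 710072; [2] Shandong University of Technology, Zibo, Shandong 255049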

Source: Journal of Northwestern Polytechnical University (《西北工业大学学报》), 2006, No. 5, pp. 595-599 (5 pages)

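Funding: Supported by the National Natural Science Foundation of China (60573096) and the Doctoral Program Foundation of the Ministry of Education of China (2069901)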

Abstract: This paper studies approximate sliding-window join queries over data streams under limited memory, that is, the load shedding problem for sliding-window join queries over data streams. The domain of the join attribute is partitioned, and the distribution of tuples' join attribute values over that domain determines the probability with which each tuple enters the sliding window that participates in the join. On this basis, a semantic load shedding strategy for sliding-window join queries over data streams is presented. Compared with existing semantic load shedding strategies, the proposed strategy needs fewer data statistics, produces join result tuples that are convenient for further query processing, and adapts well both to data distributions with various skew parameters and to different degrees of system overload. Theoretical analysis and experimental results show that the strategy is effective and practical for shedding load from sliding-window join queries over data streams.

Aim. In our opinion, the strategies of Refs. 2-5 are still not fully effective and practical for shedding load from sliding-window join queries over data streams. We present a new strategy that performs relatively better. The full paper explains the strategy in detail; this abstract lists the three topics of that explanation with some pertinent remarks: (A) problem description; (B) sliding-window join queries, where Fig. 1 in the full paper is a schematic, taken from Ref. 2, of the three operations insert, probe and invalidate; (C) a load shedding strategy based on partitioning the domain of the join attributes, where we derive eqs. (1) and (2) and where Fig. 2 in the full paper is a schematic showing how to execute the strategy with two operator modules X1 and X2. In essence, the domain of the join attributes is partitioned into sub-domains, simple data stream statistics are maintained, and tuples are dropped according to their join values. We performed two experiments: experiment 1 examines the effect of different skew parameters of the Zipf distribution; experiment 2 examines the effect of different degrees of overload. The results are shown in Figs. 3 and 4 in the full paper. The new strategy needs fewer statistics about the input data streams and makes it convenient to further process the outputs of the join operation. It also adapts well to different skew parameters of the Zipf distribution and to different peak loads. Theoretical analysis and experiments show preliminarily that the new load shedding strategy is effective and efficient for window join queries.
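The idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: eqs. (1) and (2) are not reproduced on this page, so the admission rule below is an assumption chosen only for illustration. A tuple is admitted to its window with probability equal to the relative frequency of its sub-domain on the opposite stream, normalized by the most frequent sub-domain. The class names SubdomainShedder and SheddingWindowJoin, the equal-width partition, the time-based window and the stream labels "R" and "S" are all hypothetical.

```python
import random
from collections import deque


class SubdomainShedder:
    """Per-sub-domain counts for one stream (illustrative statistics only)."""

    def __init__(self, lo, hi, k):
        self.lo, self.hi, self.k = lo, hi, k
        self.counts = [0] * k  # tuples observed in each sub-domain

    def subdomain(self, value):
        # Equal-width partition of [lo, hi); out-of-range values are clamped.
        idx = int((value - self.lo) * self.k / (self.hi - self.lo))
        return min(max(idx, 0), self.k - 1)

    def observe(self, value):
        self.counts[self.subdomain(value)] += 1

    def admit_probability(self, value):
        # ASSUMED rule, not the paper's eqs. (1)-(2): frequency of the
        # value's sub-domain relative to the most frequent sub-domain.
        top = max(self.counts)
        if top == 0:
            return 1.0  # no statistics yet, so keep every tuple
        return self.counts[self.subdomain(value)] / top


class SheddingWindowJoin:
    """Equi-join of streams "R" and "S" over time-based sliding windows."""

    def __init__(self, window, lo, hi, k, seed=0):
        self.window = window
        self.win = {"R": deque(), "S": deque()}
        self.stats = {"R": SubdomainShedder(lo, hi, k),
                      "S": SubdomainShedder(lo, hi, k)}
        self.rng = random.Random(seed)

    def process(self, stream, ts, value):
        other = "S" if stream == "R" else "R"
        opposite = self.win[other]
        # invalidate: expire tuples of the opposite window
        while opposite and opposite[0][0] <= ts - self.window:
            opposite.popleft()
        # probe: match the new tuple against the opposite window
        results = [(t, ts, value) for (t, v) in opposite if v == value]
        # maintain the simple statistics for this stream
        self.stats[stream].observe(value)
        # insert: admit with a probability driven by the opposite stream
        p = self.stats[other].admit_probability(value)
        if self.rng.random() < p:
            self.win[stream].append((ts, value))
        return results
```

Each call to process returns the join results as (timestamp in the opposite window, timestamp of the new tuple, join value). Tuples whose sub-domain has never appeared on the opposite stream get probability 0 and are shed, because they are unlikely to produce output. A real system would also tie the shedding rate to the available memory, which this sketch omits.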

Keywords: data stream; sliding window; join query; load shedding

Classification number: TP311.13 [Automation and Computer Technology: Computer Software and Theory]

 
