机构地区:[1]School of Computer Science and Technology,Huazhong University of Science and Technology [2]School of Computer Science and Technology,Wuhan University of Science and Technology
出 处:《Journal of Southwest Jiaotong University(English Edition)》2010年第2期160-165,共6页西南交通大学学报(英文版)
基 金:The National High Technology Research and Development Program of China(No.2007AA01Z309);the National Natural Science Foundation of China(No.60803160,No.60873030)
摘 要:In data streams or web scenarios at highly variable and unpredictable rates, a good join algorithm should be able to "hide" the delays by continuing to output join results. The non-blocking algorithms allow some tuples to be flushed onto disk, with the goal of producing results continuously when data transmission is suspended. But state-of-the-art algorithms have trouble with the constraint of allocated memory. To make better use of memory, a novel non-blocking join algorithm based on hash-merge for improving query response times is proposed. The reduced data structure of in-memory tuples helps to improve memory utility. A replacement selection tree is applied to adjust memory by expanding or shrinking the size of the tree and separates one external join transaction into multi-subtasks. In addition, a cost model to estimate task output rate is proposed to select the in-disk portion that promises to produce the fastest results in the external join stage. Experiments show that the technique, with far less memory, delivers results faster than the three non-blocking join algorithms ( XJoin, HMJ and RPJ ) , with up to almost two-fold improvement in reliable network and one order of magnitude improvement in unreliable network in terms of the number of the reported tuples.In data streams or web scenarios at highly variable and unpredictable rates, a good join algorithm should be able to "hide" the delays by continuing to output join results. The non-blocking algorithms allow some tuples to be flushed onto disk, with the goal of producing results continuously when data transmission is suspended. But state-of-the-art algorithms have trouble with the constraint of allocated memory. To make better use of memory, a novel non-blocking join algorithm based on hash-merge for improving query response times is proposed. The reduced data structure of in-memory tuples helps to improve memory utility. A replacement selection tree is applied to adjust memory by expanding or shrinking the size of the tree and separates one external join transaction into multi-subtasks. In addition, a cost model to estimate task output rate is proposed to select the in-disk portion that promises to produce the fastest results in the external join stage. Experiments show that the technique, with far less memory, delivers results faster than the three non-blocking join algorithms ( XJoin, HMJ and RPJ ) , with up to almost two-fold improvement in reliable network and one order of magnitude improvement in unreliable network in terms of the number of the reported tuples.
关 键 词:Hash-merge NON-BLOCKING Replacement selection tree
分 类 号:TP301.6[自动化与计算机技术—计算机系统结构]
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...