Efficient processing of ordered XML twig pattern matching based on extended Dewey  被引量:1

Efficient processing of ordered XML twig pattern matching based on extended Dewey

在线阅读下载全文

作  者:Jin-hua JIANG Ke CHEN Xiao-yan LI Gang CHEN Li-dan SHOU 

机构地区:[1]School of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China

出  处:《Journal of Zhejiang University-Science A(Applied Physics & Engineering)》2009年第12期1769-1783,共15页浙江大学学报(英文版)A辑(应用物理与工程)

基  金:Project supported by the National Natural Science Foundation of China (Nos 60603044 and 60803003);the Program for the Changjiang Scholars and Innovative Research Team in University (No IRT0652);the Key Technology Projects of Zhejiang Province, China (No. 2006c11108)

摘  要:Finding all occurrences of a twig pattern is a core operation of extensible markup language (XML) query processing. Holistic twig join algorithms, which avoid a large number of intermediate results, represent the state-of-the-art algorithms. However, ordered XML twig join is mentioned rarely in the literature and previous algorithms developed in attempts to solve the problem of ordered twig pattern (OTP) matching have poor performance. In this paper, we first propose a novel children linked stacks encoding scheme to represent compactly the partial ordered twig join results. Based on this encoding scheme and extended Dewey, we design a novel holistic OTP matching algorithm, called OTJFast, which needs only to access the labels of the leaf query nodes. Furthermore, we propose a new algorithm, named OTJFaster, incorporating three effective optimization rules to avoid unnecessary computations. This works well on available indices (such as B+-tree), skipping useless elements. Thus, not only is disk access reduced greatly, but also many unnecessary computations are avoided. Finally, our extensive experiments over both real and synthetic datasets indicate that our algorithms are superior to previous approaches.Finding all occurrences of a twig pattern is a core operation of extensible markup language (XML) query processing. Holistic twig join algorithms, which avoid a large number of intermediate results, represent the state-of-the-art algorithms. However, ordered XML twig join is mentioned rarely in the literature and previous algorithms developed in attempts to solve the problem of ordered twig pattern (OTP) matching have poor performance. In this paper, we first propose a novel children linked stacks encoding scheme to represent compactly the partial ordered twig join results. Based on this encoding scheme and extended Dewey, we design a novel holistic OTP matching algorithm, called OTJFast, which needs only to access the labels of the leaf query nodes. Furthermore, we propose a new algorithm, named OTJFaster, incorporating three effective optimization rules to avoid unnecessary computations. This works well on available indices (such as B^-tree), skipping useless elements. Thus, not only is disk access reduced greatly, but also many unnecessary computations are avoided. Finally, our extensive experiments over both real and synthetic datasets indicate that our algorithms are superior to previous approaches.

关 键 词:XML querying Ordered twig join Index Optimization 

分 类 号:TP311.13[自动化与计算机技术—计算机软件与理论] TP393.08[自动化与计算机技术—计算机科学与技术]

 

参考文献:

正在载入数据...

 

二级参考文献:

正在载入数据...

 

耦合文献:

正在载入数据...

 

引证文献:

正在载入数据...

 

二级引证文献:

正在载入数据...

 

同被引文献:

正在载入数据...

 

相关期刊文献:

正在载入数据...

相关的主题
相关的作者对象
相关的机构对象