Storage Models and Query Execution of Semi-Structured Data (半结构数据的存储模型和查询执行)

Cited by: 3


Authors: Feng Jianhua (冯建华)[1], Wang Qinke (王钦克)[1], Zhou Lizhu (周立柱)[1], Meng Xianhu (孟宪虎)[2]

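Affiliations: [1] Department of Computer Science and Technology, Tsinghua University, Beijing 100084; [2] Department of Computer Science, Yuncheng College (运城高等专科学校), Yuncheng 044000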

Source: Computer Science (《计算机科学》), 2002, No. 10, pp. 6-10 (5 pages)

Funding: Supported by the National "973" Key Basic Research Development Program (G1998030414)

Abstract: 1 Introduction. Semi-structured data are data that, unlike "raw data" such as audio and image files, have some degree of structure, yet lack the strict schema found in traditional database systems [1,2]. Semi-structured data are widespread across electronic data sources, particularly on the Internet. Take the WWW as an example: the HTML file format itself is composed of structural units such as tags and anchors, so the data in such files often have evident structure. At the same time, that structure is highly irregular and does not meet the requirements of traditional databases, so existing database techniques and tools cannot simply be applied to it; new techniques and tools for describing and processing semi-structured data need to be researched and developed.

Semi-structured data are generally modeled as labeled graphs. Data in such models are self-describing and dynamically typed, and capture both schema and data information. Although flexible, such models incur severe efficiency penalties compared with querying structured databases such as relational databases. To improve the efficiency of data manipulation by exploiting structural information, we present a hybrid method that reorganizes semi-structured data according to their degree of structure. The method extracts data with a high degree of structure and stores them in relations, while leaving the rest in its original graph form. This paper gives algorithms for generating and dynamically updating the storage model, illustrates how queries can be executed against it, and analyzes the resulting improvement in query processing compared with common execution methods. It also gives an algorithm that converts queries on semi-structured data into relational calculus, which provides a way to reuse the query optimization techniques of relational database systems.
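The hybrid storage idea in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the record does not define how the degree of structure is measured, so the sketch assumes a simple stand-in (a label becomes a relation column when it appears on at least a `threshold` fraction of objects). The names `split_storage`, `select`, and `threshold` are hypothetical, and objects are simplified to flat sets of labeled atomic values.

```python
from collections import Counter


def split_storage(objects, threshold=0.8):
    """Partition objects into a relation and a residual edge list.

    objects:   dict mapping object id -> dict of label -> atomic value.
    threshold: assumed cutoff (not from the paper); a label becomes a
               column when at least this fraction of objects carries it.
    """
    label_counts = Counter(label for attrs in objects.values() for label in attrs)
    columns = sorted(
        label for label, n in label_counts.items() if n / len(objects) >= threshold
    )

    rows = []   # relational part: (oid, one value per column)
    edges = []  # graph part: (oid, label, value) triples
    for oid, attrs in objects.items():
        # An object is "regular" when it carries every chosen column.
        regular = bool(columns) and all(c in attrs for c in columns)
        if regular:
            rows.append((oid, *(attrs[c] for c in columns)))
        for label, value in attrs.items():
            # Anything not captured by a row stays in graph form.
            if not (regular and label in columns):
                edges.append((oid, label, value))
    return columns, rows, edges


def select(columns, rows, edges, label, value):
    """Return ids of objects whose `label` edge equals `value`."""
    hits = set()
    if label in columns:
        idx = columns.index(label) + 1  # +1 skips the oid field
        hits.update(row[0] for row in rows if row[idx] == value)
    # Irregular objects and leftover labels are answered from the graph part.
    hits.update(oid for oid, l, v in edges if l == label and v == value)
    return hits
```

A selection on a column label scans the relation (where an index or a relational optimizer could be applied) and consults the edge list only for the irregular remainder, which is the kind of saving the abstract attributes to the hybrid model.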

Keywords: semi-structured data; storage model; data schema; database system; data query execution techniques

Classification: TP311.13 [Automation and Computer Technology: Computer Software and Theory]

 
