RB树:一种支持空间近似关键字查询的外存索引  被引量:9

An Index Supporting Spatial Approximate Keyword Search on Disks

在线阅读下载全文

作  者:王金宝[1] 高宏[1] 李建中[1] 杨东华[2] 

机构地区:[1]哈尔滨工业大学计算机科学与技术学院,哈尔滨150001 [2]哈尔滨工业大学基础与交叉科学研究院高性能计算中心,哈尔滨150001

出  处:《计算机研究与发展》2012年第10期2142-2152,共11页Journal of Computer Research and Development

基  金:国家"九七三"重点基础研究发展计划基金项目(2006CB303005);国家自然科学基金项目(60903016;60533110;60773063);教育部新世纪优秀人才支持计划基金项目(NCET-05-0333);黑龙江省教育厅科学技术研究项目(11531276)

摘  要:空间近似关键字查询包含一个空间条件和一组关键字相似性条件,这种查询在空间数据库中返回同时满足以下条件的对象:1)对象的位置信息满足查询中的空间条件;2)对于查询中的任何一个关键字,对象中至少包含一个关键字与其相似度大于给定阈值.随着当前数据的爆炸性增长,空间数据库无法完整地存放在内存中,因此空间数据库需要支持空间近似关键字查询的外存索引.目前,还没有在外存中支持精确的空间近似关键字查询的索引结构.设计了一种新型的外存索引RB树,在外存中支持精确的空间近似关键字查询.RB树支持的空间近似关键字查询包括多种空间条件,如范围查询、NN查询,同时支持多种关键字相似性度量,包括编辑距离、规范化编辑距离等.通过真实数据中的性能测试验证了RB树的效率.Spatial approximate keyword queries consist of a spatial condition and a set of keywords as the textual condition. The spatial condition requires that the returned objects are inside a spatial region or nearby a location, and the textual condition requires that the returned objects are labeled with a set of keywords similar to the queried keywords. Such queries enable users to find objects they are interested in within a spatial database, and make mismatches between users' query keywords and spatial object keywords tolerant. With the rapid growth of data, spatial databases storing objects from diverse geographical regions can be no longer held in memories. Thus, it is essential to answer spatial approximate keyword queries over disk resident datasets. Existing works present methods either that return incomplete answers or index in memory, and effective solutions in disks are in demand. This paper presents a novel disk resident index RB-tree to support spatial approximate keyword queries. We study the principle of augmenting R-tree with the capacity of approximate keyword searching based on existing solutions, and store two kinds of bitmaps in R-tree nodes to build an RB-tree. RB- tree supports a wide range of spatial conditions such as range and nearest neighbor, combined with keyword similarity metrics such as edit distance, dice etc. Experimental results against R-tree on two real world datasets demonstrate the efficiency of our solution.

关 键 词:空间数据库 关键字 查询处理 索引 外存 

分 类 号:TP311.13[自动化与计算机技术—计算机软件与理论]

 

参考文献:

正在载入数据...

 

二级参考文献:

正在载入数据...

 

耦合文献:

正在载入数据...

 

引证文献:

正在载入数据...

 

二级引证文献:

正在载入数据...

 

同被引文献:

正在载入数据...

 

相关期刊文献:

正在载入数据...

相关的主题
相关的作者对象
相关的机构对象