Optimization of multidimensional index query mechanism based on HBase  (Cited by: 12)


Authors: XU Jiangfeng [1]; TAN Yulong (School of Information Engineering, Zhengzhou University, Zhengzhou, Henan 450001, China)

Affiliation: [1] School of Information Engineering, Zhengzhou University

Source: Journal of Computer Applications (《计算机应用》), 2020, No. 2, pp. 571-577 (7 pages)

Funding: Fundamental Research Funds for the Central Universities (20190605)

Abstract: Key-value stores are designed to retrieve values from very large data sets while offering high availability, fault tolerance, and scalability, which makes them a much-needed infrastructure for Location-Based Services (LBS). However, complex queries over multidimensional data cannot be processed efficiently, because a key-value store provides no way to access data by multiple attributes. To address the inability of the key-value store HBase to handle multidimensional data efficiently, a unified indexing framework named New-grid was proposed so that HBase supports multidimensional queries. A set of nodes was organized in an improved P-grid overlay network to provide efficient data distribution, fault tolerance, and query processing over multidimensional data. For indexing, a Hilbert space-filling curve was used to preserve data locality, so that multidimensional data could be managed effectively in the key-value store. HBase was used as the underlying storage layer, and algorithms for range queries and K-Nearest Neighbor (KNN) queries were proposed that eliminate the overhead of maintaining separate index tables. Extensive experiments were conducted on Amazon EC2 with clusters of 4, 8, and 16 commodity nodes. The experimental results show that New-grid outperforms MD-HBase and MapReduce.
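The role of the Hilbert curve in the abstract can be illustrated with a minimal Python sketch. It is not the New-grid implementation, which the abstract does not detail. It assumes a two-dimensional n-by-n grid with n a power of two, uses the standard iterative Hilbert mapping, and introduces hypothetical helper names (hilbert_index, row_key, range_to_scans). The range helper enumerates cells by brute force purely for clarity; it only shows how a rectangular query turns into a few contiguous key intervals, each of which could be served by one scan over a sorted key-value store.

```python
from typing import List, Tuple


def _rotate(n: int, x: int, y: int, rx: int, ry: int) -> Tuple[int, int]:
    """Rotate or flip a quadrant so the sub-curve has the right orientation."""
    if ry == 0:
        if rx == 1:
            x, y = n - 1 - x, n - 1 - y
        x, y = y, x
    return x, y


def hilbert_index(n: int, x: int, y: int) -> int:
    """Map cell (x, y) of an n-by-n grid (n a power of two) to its Hilbert index."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        x, y = _rotate(n, x, y, rx, ry)
        s //= 2
    return d


def row_key(n: int, x: int, y: int) -> str:
    """Hypothetical row key: zero-padded index, so string order equals curve order."""
    width = len(str(n * n - 1))
    return f"{hilbert_index(n, x, y):0{width}d}"


def range_to_scans(n: int, x0: int, y0: int, x1: int, y1: int) -> List[Tuple[int, int]]:
    """Cover the rectangle [x0, x1] x [y0, y1] with inclusive index intervals.

    Naive approach (an assumption for illustration): enumerate every cell,
    sort the indices, and merge consecutive runs into (start, end) pairs.
    """
    indices = sorted(
        hilbert_index(n, x, y)
        for x in range(x0, x1 + 1)
        for y in range(y0, y1 + 1)
    )
    scans: List[Tuple[int, int]] = []
    for d in indices:
        if scans and d == scans[-1][1] + 1:
            scans[-1] = (scans[-1][0], d)  # extend the current run
        else:
            scans.append((d, d))  # start a new run
    return scans


if __name__ == "__main__":
    # Each printed interval corresponds to one contiguous key range.
    for start, end in range_to_scans(8, 2, 2, 4, 5):
        print(start, end)
```

Because cells that are adjacent along the curve are also adjacent in the grid, nearby points tend to receive nearby keys, which is the locality property the abstract relies on. A production version would compute the intervals by recursive quadrant decomposition instead of enumerating every cell.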

Keywords: location-based service; multidimensional index; HBase; space-filling curve; overlay network

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]

 
