Implementation method of lightweight distributed index based on log structured merge-tree  Cited by: 2


Authors: CUI Shuangshuang [1]; WANG Hongzhi [1] (Faculty of Computer Science, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China)

Affiliation: [1] Faculty of Computer Science, Harbin Institute of Technology, Harbin 150001

Source: Journal of Computer Applications, 2021, No. 3, pp. 630-635 (6 pages)

Funding: National Key Research and Development Program of China (2018YFB1004700); National Natural Science Foundation of China (U1866602, 61602129, 61772157).

Abstract: Existing distributed databases built on the Log Structured Merge-Tree (LSM-Tree) support efficient queries only on the primary key and cannot be quickly applied by users in their own clusters. To solve this problem, a lightweight distributed index implementation method based on LSM-Tree, called SIBL (Secondary Index Based LSM-Tree), was proposed. Firstly, the query efficiency of non-primary key attributes was improved by indexing the primary key attribute columns. Then, a distributed index construction algorithm and an index interval division algorithm based on equidistant sampling were proposed to ensure that the index is evenly distributed in the system. In addition, the query algorithm of the traditional index was optimized: the index file was treated as a special data file and stored in the system in a distributed manner, which ensures the load balance and scalability of the system. Finally, the proposed method was compared experimentally with Huawei's secondary index scheme HIndex on the HBase database in terms of the time and space overhead of index construction, index query performance, and system load balance. The results verify that the proposed method improves query performance by 50 to 200 times.
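The abstract names an index interval division step based on equidistant sampling but gives no details. The sketch below is one plausible reading, not the SIBL algorithm itself: it sorts a sample of the indexed attribute, takes boundaries at equal rank spacing, and routes each index entry (attribute value, primary key) to the interval that contains it. All function names are hypothetical, and plain in-memory lists stand in for the distributed index files.

```python
from bisect import bisect_right


def equidistant_split_points(sample_values, num_regions):
    """Return up to num_regions - 1 boundaries taken at equal rank spacing
    from the sorted sample (hypothetical helper, not from the paper)."""
    if num_regions < 1:
        raise ValueError("num_regions must be at least 1")
    ordered = sorted(sample_values)
    if not ordered:
        return []
    # Integer arithmetic keeps every index strictly below len(ordered).
    picks = [ordered[i * len(ordered) // num_regions]
             for i in range(1, num_regions)]
    # Duplicate boundaries would create empty intervals, so drop them.
    return sorted(set(picks))


def region_for(value, split_points):
    """Index of the interval that holds value; a boundary value belongs
    to the interval on its right."""
    return bisect_right(split_points, value)


def build_index(rows, column, split_points):
    """Distribute (attribute value, primary key) entries over the intervals.
    rows maps primary key -> record dict (an assumed layout)."""
    regions = [[] for _ in range(len(split_points) + 1)]
    for primary_key, record in rows.items():
        value = record[column]
        regions[region_for(value, split_points)].append((value, primary_key))
    for entries in regions:
        entries.sort()
    return regions


def lookup(regions, split_points, value):
    """Return the primary keys whose indexed attribute equals value,
    scanning only the single interval that can contain it."""
    entries = regions[region_for(value, split_points)]
    return [primary_key for v, primary_key in entries if v == value]
```

With evenly spaced ranks, each interval receives roughly the same number of index entries when the sample is representative, which is the load-balancing property the abstract attributes to this step. A query on the indexed attribute then touches one interval to obtain primary keys, and the records themselves would be fetched by primary key from the underlying store.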

Keywords: Log Structured Merge-Tree (LSM-Tree); distributed index; HBase; query optimization

Classification: TP312 [Automation and Computer Technology: Computer Software and Theory]

 
