面向内存数据库的类字典树索引综述与性能比较  被引量:2

Review and Performance Comparison of Trie-like Indexes for Main-Memory Databases

在线阅读下载全文

作  者:储召乐 罗永平 金培权[1,2] CHU Zhao-Le;LUO Yong-Ping;JIN Pei-Quan(School of Computer Science and Technology,University of Science and Technology of China,Hefei 230027;Key Laboratory of Electromagnetic Space Information,Chinese Academy of Sciences,Hefei 230027)

机构地区:[1]中国科学技术大学计算机科学与技术学院,合肥230027 [2]中国科学院电磁空间信息重点实验室,合肥230027

出  处:《计算机学报》2024年第9期2009-2034,共26页Chinese Journal of Computers

基  金:国家自然科学基金面上项目“面向异构混合内存的NVM感知索引及自适应学习方法研究”(No.62072419)的资助.

摘  要:如何快速存取海量数据是大数据时代数据库系统面临的重大挑战.利用大内存构建内存数据库系统是实现大数据实时存取的可行途径.在此背景下,用于加速内存数据存取的内存数据库索引成为近几年国内外的研究热点.但是,内存数据库索引也面临着诸多挑战.以常见的内存B+树索引为例,第一个问题是索引的空间效率较低,这是因为内存B+树索引的节点内部存在较大的空间浪费;第二个问题是索引的查询复杂度较高,B+树的查询复杂度受限于数据规模,随着数据规模的扩张,索引的搜索效率也会下降;第三个问题是变长数据支持弱,B+树对于变长键的支持比较差,往往难以适应实际应用的需要.近年来,由于字典树具有空间代价低、查询效率与数据规模无关、支持变长键等优点,逐步成为了内存数据库索引研究中的一个主要方向.本论文围绕面向内存数据库的类字典树索引,首先介绍了字典树的概念、特点和历史,然后系统梳理和总结了类字典树索引的现状和最新进展,之后提出了一种全新的分类方法对类字典树索引进行了分类.在此基础上,论文对主流的六种类字典树索引进行了实验,在多个数据集和负载上进行了性能对比,并基于实验结果讨论了类字典树索引的设计和使用建议,最后展望了未来类字典树索引的发展方向.Accelerating query performance over massive data has become a major challenge in database systems in the big data era.Building main-memory database systems on top of large memory has been a feasible way to support real-time big data access.Accordingly,main-memory indexes for accelerating data access have been a research focus in recent years.However,mainmemory indexes face a few challenges.Taking the main-memory B+tree as an example,the first issue is its low space efficiency stemming from the space overheads incurred by the internal nodes.Second,the B+tree’s query complexity directly correlates with data size,leading to diminishing search efficiency as dataset volumes escalate.Third,the B+tree cannot deal with variable-length keys efficiently,failing to meet the real requirements of applications.In recent years,trie-like indexes(also called trie indexes)have emerged as a focal point in main-memory index research due to their advantages of low space cost,high data-scale-independent query efficiency,and supporting variable-length keys.This paper focuses on trie-like indexes for main-memory databases.We first introduce some foundational aspects of trie indexes,including their concept,characteristics,and history.Then,we summarize the current achievements and advances of trielike indexes.Next,we provide a new taxonomy to classify trie-like indexes,based on which we implement six prominent trie-like indexes and evaluate their performance on various datasets and workloads.Further,we present some suggestions for the design and use of trie-like indexes based on the experimental results.Finally,we discuss the future development of trie-like indexes.

关 键 词:内存数据库 字典树索引 性能对比 

分 类 号:TP311[自动化与计算机技术—计算机软件与理论]

 

参考文献:

正在载入数据...

 

二级参考文献:

正在载入数据...

 

耦合文献:

正在载入数据...

 

引证文献:

正在载入数据...

 

二级引证文献:

正在载入数据...

 

同被引文献:

正在载入数据...

 

相关期刊文献:

正在载入数据...

相关的主题
相关的作者对象
相关的机构对象