An Overview of Learned Index Technologies for Intelligent Database (智能数据库学习型索引研究综述)

Cited by: 4

Authors: CAI Pan; ZHANG Shao-Min; LIU Pei-Ran; SUN Lu-Ming; LI Cui-Ping [1,2]; CHEN Hong [1,2]

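Affiliations: [1] Key Laboratory of Data Engineering and Knowledge Engineering of the Ministry of Education, Renmin University of China, Beijing 100872; [2] School of Information, Renmin University of China, Beijing 100872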

Source: Chinese Journal of Computers (《计算机学报》), 2023, No. 1, pp. 51-69 (19 pages)

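Funding: Supported by the National Natural Science Foundation of China (62072460, 62076245, 62172424, 62276270); the Beijing Natural Science Foundation (4212022); and the Research Funds of Renmin University of China (Fundamental Research Funds for the Central Universities) (22XNH189).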

Abstract: Building efficient index structures is one of the key techniques for improving database access performance. In big-data environments characterized by explosive data growth, massive aggregation, and high-dimensional complexity, traditional index structures (e.g., the B+-tree) face high space cost, low query efficiency, and high access overhead when handling massive data. Learned index techniques model and learn features such as the underlying data distribution and the query workload, which effectively improves index performance and reduces memory-access space overhead. Starting from the basic model of learned indexes, this paper analyzes the implementation principle, construction, and query process of the basic RMI model and summarizes its advantages and open problems. On this basis, it classifies learned index techniques by the characteristics of their index structures, systematically reviews them from two perspectives (index creation method and update strategy), and compares the strengths and weaknesses of typical learned index techniques. In addition, the paper summarizes extension research on learned indexes. Finally, it discusses future research directions for learned indexes.

Data access is an important operation in a database system, and increasing the speed of data access is critical to improving system performance. Both academia and industry have therefore devoted the past decades to building efficient index structures for data access in database systems. Nevertheless, traditional index structures (e.g., the B+-tree) face high space cost, low query efficiency, and high access overhead in the era of big data, which is characterized by explosive data growth, massive aggregation, and high-dimensional complexity. To deal with these issues, machine learning methods have been applied to traditional index structures, advancing research on learned indexes, which has gradually become one of the research hotspots in the database field. The core of a learned index is to approximate the cumulative distribution function of the underlying data through machine learning methods, realizing the mapping between keys and record positions. During a query, the learned index predicts the location of a record from its key. The tree traversal of traditional index structures is replaced by model prediction, which effectively improves query speed and reduces storage overhead. Learned indexes provide new research ideas for improving the performance of traditional index structures. A considerable amount of literature has investigated learned index structures, so it is worthwhile to present the state of the art. This paper first introduces the research background of the learned index. Then, we present a review of current research and analyze the proposed methods in terms of the fundamental and extension issues. The Recursive Model Index (RMI) model serves as the thread connecting all the learned indexes discussed. Specifically, the basic problems of the learned indexes mainly include: (1) The basic RMI model that this paper systematically summarizes the im
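The idea stated in the abstract (approximate the cumulative distribution function of the keys, predict a record position from the key, then correct the prediction locally) can be illustrated with a minimal sketch. This is not the RMI model surveyed in the paper, which stacks several models in stages; it is a single linear model fitted by least squares, and the class name `LinearLearnedIndex`, its methods, and the sample keys are all hypothetical names chosen for illustration. It assumes a non-empty, in-memory list of numeric keys.

```python
import bisect


class LinearLearnedIndex:
    """Illustrative single-model learned index (hypothetical, not RMI itself)."""

    def __init__(self, keys):
        # Assumption: keys is a non-empty collection of numbers.
        self.keys = sorted(keys)
        n = len(self.keys)
        # Least-squares fit of position ~ slope * key + intercept, which is
        # a scaled linear approximation of the keys' empirical CDF.
        mean_k = sum(self.keys) / n
        mean_p = (n - 1) / 2
        var_k = sum((k - mean_k) ** 2 for k in self.keys)
        cov = sum((k - mean_k) * (p - mean_p) for p, k in enumerate(self.keys))
        self.slope = cov / var_k if var_k else 0.0
        self.intercept = mean_p - self.slope * mean_k
        # Largest prediction error over the stored keys; it bounds the
        # window that has to be searched after a prediction.
        self.max_err = max(
            abs(self._predict(k) - p) for p, k in enumerate(self.keys)
        )

    def _predict(self, key):
        # Model prediction, clamped to a valid array position.
        pos = int(round(self.slope * key + self.intercept))
        return min(max(pos, 0), len(self.keys) - 1)

    def lookup(self, key):
        """Return the position of key in the sorted array, or None."""
        guess = self._predict(key)
        lo = max(guess - self.max_err, 0)
        hi = min(guess + self.max_err + 1, len(self.keys))
        # Binary search only inside the error window around the guess.
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < len(self.keys) and self.keys[i] == key:
            return i
        return None


index = LinearLearnedIndex([3, 8, 15, 16, 23, 42, 108, 1000])
print(index.lookup(23), index.lookup(7))
```

Recording the maximum error at build time is what keeps lookups correct even when the model fits poorly: a skewed key set such as the one above only widens the search window, it does not cause missed keys.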

Keywords: machine learning; learned index; index structure; RMI model; intelligent database

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
