Updatable machine learning index model based on BIRCH clustering  Cited by: 1


Authors: CAO Wei-dong [1]; JIN Chao (School of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300, China)

Affiliation: [1] School of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300

Source: Computer Engineering and Design, 2023, No. 11, pp. 3328-3334 (7 pages)

Funding: Joint Funds of the National Natural Science Foundation of China (U1833114).

Abstract: To meet the demand for index designs with high throughput and low memory footprint in database systems in the era of big data, an updatable machine learning index model based on BIRCH clustering is proposed for massive data. The data set is partitioned with BIRCH clustering, and a feedforward neural network is trained to fit each resulting segment. Index updates are supported by an out-of-place insertion strategy that follows the delayed-update idea of the log-structured merge tree. Experiments on real data sets show that, compared with traditional indexes and current state-of-the-art machine learning index structures, the model achieves a moderate improvement in lookup speed and substantial improvements in insertion performance, memory footprint, and training time.

Keywords: massive data; machine learning; index design; clustering; log-structured merge tree; data access hotness; dynamic update

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]
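
The pipeline outlined in the abstract (cluster the keys, fit one model per segment, buffer inserts out of place and merge later) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the names `cf_segments`, `Segment`, and `LearnedIndex` are hypothetical, the clustering is a simplified one-dimensional pass over BIRCH-style clustering features (N, LS, SS) rather than a full CF-tree, a least-squares line stands in for the per-segment feedforward neural network, and the merge step simply rebuilds all segments once the buffer reaches a hypothetical `buffer_limit`.

```python
import bisect


def cf_segments(sorted_keys, threshold):
    """Split sorted 1-D keys into segments using clustering features
    (N, LS, SS); simplified stand-in for BIRCH, with no CF-tree."""
    segments, current = [], []
    n = ls = ss = 0.0
    for k in sorted_keys:
        n2, ls2, ss2 = n + 1, ls + k, ss + k * k
        radius = max(ss2 / n2 - (ls2 / n2) ** 2, 0.0) ** 0.5
        if current and radius > threshold:
            # Absorbing k would exceed the radius threshold: start a new segment.
            segments.append(current)
            current, n2, ls2, ss2 = [], 1.0, float(k), float(k * k)
        current.append(k)
        n, ls, ss = n2, ls2, ss2
    if current:
        segments.append(current)
    return segments


class Segment:
    """One segment: a least-squares line from key to position
    (stand-in for a feedforward network) plus its maximum error."""

    def __init__(self, keys):
        self.keys = keys
        n = len(keys)
        mean_k, mean_p = sum(keys) / n, (n - 1) / 2
        var = sum((k - mean_k) ** 2 for k in keys)
        cov = sum((k - mean_k) * (i - mean_p) for i, k in enumerate(keys))
        self.slope = cov / var if var else 0.0
        self.intercept = mean_p - self.slope * mean_k
        self.err = max(abs(self._predict(k) - i) for i, k in enumerate(keys))

    def _predict(self, key):
        return self.slope * key + self.intercept

    def find(self, key):
        # Binary search only inside the window allowed by the error bound.
        guess = self._predict(key)
        lo = max(int(guess - self.err) - 1, 0)
        hi = min(int(guess + self.err) + 2, len(self.keys))
        i = bisect.bisect_left(self.keys, key, lo, hi)
        return i < len(self.keys) and self.keys[i] == key


class LearnedIndex:
    def __init__(self, keys, threshold, buffer_limit=4):
        self.threshold, self.buffer_limit = threshold, buffer_limit
        self.buffer = []  # sorted out-of-place insert buffer
        self._build(sorted(keys))

    def _build(self, sorted_keys):
        self.segments = [Segment(s) for s in cf_segments(sorted_keys, self.threshold)]
        self.bounds = [s.keys[0] for s in self.segments]

    def insert(self, key):
        # Out-of-place insert: trained segments are untouched until the merge.
        bisect.insort(self.buffer, key)
        if len(self.buffer) >= self.buffer_limit:
            merged = sorted([k for s in self.segments for k in s.keys] + self.buffer)
            self.buffer = []
            self._build(merged)  # assumption: merge by rebuilding everything

    def contains(self, key):
        j = bisect.bisect_left(self.buffer, key)
        if j < len(self.buffer) and self.buffer[j] == key:
            return True
        s = bisect.bisect_right(self.bounds, key) - 1
        return s >= 0 and self.segments[s].find(key)
```

In this sketch a lookup checks the small insert buffer first, then routes the key to a segment by its lower bound and searches only the window permitted by that segment's recorded error. Deferring the rebuild until the buffer fills is what keeps individual inserts cheap; how the paper actually schedules and scopes its merges is not stated in the abstract.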

 
