Authors: WU Fan[1]; HAN Jingyu[1,2]; LIU Yang; LI Caiyun; MIAO Zhuqing; WANG Yanzhi; MAO Yi
Affiliations: [1] School of Computer Science and Technology, Nanjing University of Posts and Telecommunications, Nanjing 210023, China; [2] Jiangsu Key Laboratory of Big Data Security and Intelligent Processing, Nanjing 210023, China
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2024, No. 9, pp. 2055-2062 (8 pages)
Funding: Supported by the National Natural Science Foundation of China (62002174).
Abstract: Learned indexes have attracted attention as replacements for traditional indexes in multidimensional query processing because they reduce storage consumption. Trajectory points, however, cluster in certain intervals of the time or space dimensions. The resulting skewed data distribution degrades the prediction accuracy of the learned model and leads to a high number of disk accesses. We propose a trajectory index based on a Piecewise Linear Regression Tree (PLRT) that lowers storage cost and reduces the number of disk accesses. The index is built in two stages: data sorting and model training. In the first stage, trajectory points are partitioned along the time dimension into a series of spatiotemporal subregions; within each subregion, the points are stored by their spatial dimensions according to a mapping function, which determines the global sequence number of each trajectory point. In the second stage, a piecewise linear regression tree is built from the initial data as the prediction model, and future data are stored at the positions the model predicts. Experiments on simulated and real datasets show that the method achieves better query performance than learned indexes while substantially reducing storage consumption and construction time.
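Keywords: trajectory points; learned index; piecewise linear regression tree; range query; point query
Classification No.: TP392 [Automation and Computer Technology: Computer Application Technology]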
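The two stages summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' PLRT implementation: the record does not specify the mapping function, the partitioning rule, or the tree construction, so the sketch assumes fixed-width time slabs (`slab_width`), a Z-order (Morton) code as the spatial mapping, and a tree that splits at the midpoint whenever a least-squares line misses some position by more than a hypothetical threshold `eps`. All function names are illustrative.

```python
from bisect import bisect_left

def morton(x, y, bits=16):
    """Assumed spatial mapping: interleave the bits of x and y (Z-order code)."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i) | ((y >> i) & 1) << (2 * i + 1)
    return code

def sort_keys(points, slab_width, bits=16):
    """Stage 1: bucket (t, x, y) points into time slabs, then order each slab
    by Z-order code. A key's index in the result is its global sequence number."""
    return sorted(((t // slab_width) << (2 * bits)) | morton(x, y, bits)
                  for t, x, y in points)

def fit_line(keys, lo, hi):
    """Least-squares line position ~ a*key + b over keys[lo:hi], plus max error."""
    n = hi - lo
    mean_k = sum(keys[lo:hi]) / n
    mean_p = (lo + hi - 1) / 2
    var = sum((k - mean_k) ** 2 for k in keys[lo:hi])
    cov = sum((keys[i] - mean_k) * (i - mean_p) for i in range(lo, hi))
    a = cov / var if var else 0.0
    b = mean_p - a * mean_k
    err = max(abs(a * keys[i] + b - i) for i in range(lo, hi))
    return a, b, err

def build(keys, lo, hi, eps):
    """Stage 2: recursively split until one line fits the segment within eps."""
    a, b, err = fit_line(keys, lo, hi)
    if err <= eps or hi - lo <= 2:
        return ("leaf", a, b, err)
    mid = (lo + hi) // 2
    return ("node", keys[mid], build(keys, lo, mid, eps), build(keys, mid, hi, eps))

def lookup(tree, keys, key):
    """Descend to a leaf, predict the position, then search only the error window."""
    node = tree
    while node[0] == "node":
        node = node[2] if key < node[1] else node[3]
    _, a, b, err = node
    p, r = int(a * key + b), int(err) + 1
    n = len(keys)
    lo = min(max(p - r, 0), n)
    hi = min(max(p + r + 1, lo), n)
    i = bisect_left(keys, key, lo, hi)
    return i if i < n and keys[i] == key else None
```

In this sketch, each leaf stores its own maximum prediction error, so a lookup scans only a small window around the predicted position. That is one plausible way a model of this kind could keep the number of accesses low on skewed data.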