Authors: CEN Libin (岑黎彬); LI Jingdong (李靖东); LIN Chunbo (林淳波); WANG Xiaoling (王晓玲)[1] (School of Computer Science and Technology, East China Normal University, Shanghai 200062, China; Gauss Database Labs, Huawei Technologies Company Limited, Shanghai 201206, China)
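Affiliations: [1] School of Computer Science and Technology, East China Normal University, Shanghai 200062; [2] Gauss Database Labs, Huawei Technologies Company Limited, Shanghai 201206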
Source: Journal of Computer Applications (《计算机应用》), 2023, No. 7, pp. 2034-2039 (6 pages)
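Funding: Key Program of the National Natural Science Foundation of China (62136002); Key Project of the Science and Technology Commission of Shanghai Municipality (20DZ1100300).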
Abstract: Approximate query processing (AQP) of aggregate functions has been a research hotspot in the database field in recent years. Existing approximate query techniques suffer from long query response times, high storage overhead, and a lack of support for multi-predicate queries. To address these problems, a deep autoregressive model-based AQP method, DeepAQP (Deep Approximate Query Processing), was proposed. DeepAQP uses a deep autoregressive model to learn the joint probability distribution of multiple columns in a table, and from it estimates the predicate selectivity and the target column's probability distribution for a given query, so that multi-predicate aggregate queries over a single table can be answered approximately and efficiently. Experiments were conducted on the TPC-H and TPC-DS datasets. The results show that, compared with the sampling-based method VerdictDB, DeepAQP reduces query response time by 2 to 3 orders of magnitude and storage overhead by 3 orders of magnitude; compared with DBEst++, a method based on traditional machine learning models, DeepAQP reduces query response time by 1 order of magnitude, significantly reduces model training time, and can handle multi-predicate queries that DBEst++ does not support. DeepAQP thus balances query accuracy and speed while significantly reducing training and storage overhead.
Keywords: approximate query processing; autoregressive model; multi-predicate query; deep learning; aggregate function
Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]
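
The abstract describes answering aggregate queries from an estimated predicate selectivity and a target-column distribution. The sketch below illustrates that general idea only and is not the authors' implementation: the deep autoregressive model is replaced by plain frequency tables that follow the same chain-rule factorization P(x1) P(x2|x1) P(x3|x1,x2), and the table, column layout, and function names (`fit_chain_rule`, `approx_aggregates`) are hypothetical.

```python
from collections import Counter


def fit_chain_rule(rows):
    """Fit P(x1), P(x2|x1), ... from counts (stand-in for a learned model).

    ASSUMPTION: frequency tables replace the deep autoregressive network;
    only the chain-rule factorization of the joint distribution is kept.
    """
    n_cols = len(rows[0])
    # prefix_counts[k] counts how often each length-k prefix of a row occurs.
    prefix_counts = [Counter(row[:k] for row in rows) for k in range(n_cols + 1)]

    def joint(values):
        # Multiply the conditionals P(x_k | x_1..x_{k-1}) column by column.
        p = 1.0
        for k in range(n_cols):
            denom = prefix_counts[k][values[:k]]
            if denom == 0:
                return 0.0
            p *= prefix_counts[k + 1][values[:k + 1]] / denom
        return p

    support = list(prefix_counts[n_cols])  # distinct full rows seen in the table
    return joint, support


def approx_aggregates(rows, predicates, target):
    """Estimate COUNT, SUM and AVG of column `target` under all `predicates`.

    `predicates` maps a column index to a boolean function on that column.
    """
    joint, support = fit_chain_rule(rows)
    matching = [
        v for v in support
        if all(pred(v[col]) for col, pred in predicates.items())
    ]
    # Selectivity: probability mass of the rows satisfying every predicate.
    selectivity = sum(joint(v) for v in matching)
    if selectivity == 0:
        return {"COUNT": 0.0, "SUM": 0.0, "AVG": None}
    # AVG is the mean of the target column conditioned on the predicates.
    avg = sum(v[target] * joint(v) for v in matching) / selectivity
    count = len(rows) * selectivity
    return {"COUNT": count, "SUM": count * avg, "AVG": avg}


# Hypothetical table with columns (region, year, price).
rows = [
    ("east", 2021, 10.0),
    ("east", 2022, 20.0),
    ("west", 2022, 30.0),
    ("east", 2022, 40.0),
    ("west", 2021, 50.0),
    ("east", 2022, 20.0),
]

# SELECT COUNT(*), SUM(price), AVG(price) WHERE region = 'east' AND year >= 2022
result = approx_aggregates(
    rows,
    predicates={0: lambda r: r == "east", 1: lambda y: y >= 2022},
    target=2,
)
print(result)
```

Because the frequency tables here are exact, the estimates coincide with the true answers on this toy table; with a learned model the same three formulas would yield approximations whose error depends on how well the model fits the data.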