机构地区:[1]重庆大学计算机学院,重庆400044 [2]重庆大学通信工程学院,重庆400044 [3]华东师范大学计算机科学与软件工程学院,上海200062
出 处:《计算机学报》2019年第11期2417-2428,共12页Chinese Journal of Computers
基 金:国家“八六三”高技术研究发展计划项目基金(2015AA015304);国家自然科学基金(61472052);中国博士后科学基金(2017M620412)资助~~
摘 要:多表连接操作是嵌入式数据库、数据仓库等系统中的一个重要操作.因此,提升多表连接的性能能够加快数据处理和分析的速度,进而提升系统的整体性能.新型的非易失性存储器(Non VolatileMemory,NVM)具有内存级读写速度、存储密度高、可字节寻址和持久化等优点,成为补充或替代DRAM的新型存储设备.然而,直接将现有的多表连接算法应用在NVM上会带来两个问题:(1)现有算法不能充分发挥新型非易失性存储器的优势,无法展现较优的性能;(2)连接算法会生成大量中间表,对存储设备造成大量写操作.由于NVM的写耐受度有限,现有多表连接操作极易造成NVM的损坏.该文考虑NVM写耐受度有限的特性,旨在减少多表连接操作引起的对NVM的写操作.首先,该文提出优化连接顺序的NVjoin算法,该算法解析不同表之间的关联性,并通过采样的方法估算中间结果的大小,从而选择较优的连接顺序,尽可能减少NVM上的写操作.其次,该文设计了一个组织中间结果的数据结构LWTab,该结构充分利用了NVM可字节寻址的特性,通过存储数据的地址而非数据的方式,进一步减少连接过程中中间结果所产生的NVM写操作.该文利用DRAM模拟NVM进行大量的测试实验,结果表明,该文提出的算法在时间性能与NVM写次数两个方面均得到提升:与MySQL所提供的连接顺序相比,NVjoin可以减少104.21倍的NVM写操作并提升65.01%的性能.除此之外,LWTab可以在NVjoin的基础上,进一步减少16.74倍的NVM写操作以及提升71.86%的性能.Multi-table join operation is one of the most important operations in lots of systems,such as embedded databases and data warehouses.Hence improving the performance of Multi-table join operation can accelerate the speed of data processing and analysis.The overall performance of the systems can be also enhanced.Emerging non-volatile memories(NVMs)have been widely discussed to complement or substitute DRAM in memory hierarchy in recent years.When the NVM is incorporated into memory hierarchy,it has fast read/write speed,high storage density,byte addressability,low power consumption and persistency.However,employing existing multi-table join algorithms directly on NVM incurs two big problems:(1)existing algorithms cannot fully exploit the benefits of NVM which may stand in the way of improving the performance of systems.(2)Existing algorithms produce large intermediate table size and a large number of copies of data,which causes a lot of NVM writes to the storage device.Since NVM suffers from the limitation of write endurance(i.e.,once the number of writes of a memory cell exceeds the endurance limit,the NVM chip is considered worn-out),existing multi-table join algorithms can easily make NVM device breakdown.In this paper,we consider the limited write endurance of NVM and aim to reduce the number of NVM writes caused by multi-table join algorithms.First,we propose an algorithm,NVjoin,to select an optimal order for joining multiple tables.NVjoin algorithm analyzes the relation between tables according to the joining statements entered by user.And then estimate the size of the intermediate tables using sampling techniques.In this way,we can obtain an optimal order so as to reduce the number of NVM writes.Second,a new data structure,LWTab,is designed to keep the intermediate results of multi-table join algorithms.The new structure makes full use of the byte-addressability of NVM.Specifically,instead of storing data,we store the address of data in LWTab,which can significantly reduce the number of copies of data and th
分 类 号:TP311[自动化与计算机技术—计算机软件与理论]
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...