Space reclamation based on aggregation unit in mass small file system  Cited by: 1


Authors: XU Jun [1], HE Lianyue [1,2], YAN Weiwei, CHEN Bo, XU Zhaomiao [1]

Affiliations: [1] College of Computer, National University of Defense Technology, Changsha, Hunan 410073, China; [2] Beijing Netclouds Information Technology Company Limited, Beijing 100067, China

Source: Journal of Computer Applications, 2018, Issue A01, pp. 108-111 (4 pages)

Abstract: Because the open-source distributed file system HDFS (Hadoop Distributed File System) does not support random reads and writes, SMDFS, a distributed mass small file system built on HDFS, supports deleting an aggregation space but not deleting individual files. Based on an analysis of file deletion behavior in SMDFS, an approach was adopted that combines real-time deletion of metadata with after-the-fact defragmentation of storage space. Because SMDFS cannot retrieve the undeleted small files from a data file, the concept of an aggregation unit is proposed: each data file corresponds to one aggregation unit, through which all undeleted small files and all storage space fragments in that data file can be obtained. Defragmentation is achieved by migrating the undeleted small files to other data files and then deleting the entire data file. A Master-Worker distributed space reclamation framework was designed, and file deletion was implemented in SMDFS. Tests show that, compared with SMDFS 2.0, SMDFS 2.1, which supports file deletion, shows no significant decline in file read and write performance; while defragmentation is running, system write performance drops by 30% and system read performance drops by 18%.
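The deletion flow described in the abstract can be illustrated with a small in-memory sketch. It is not the SMDFS implementation: the class `AggregationUnit`, the functions `pick_victims` and `defragment`, and the fragmentation threshold are all hypothetical names and policies introduced here for illustration. The sketch only assumes what the abstract states: data files are append-only, deleting a small file removes its metadata immediately, and space is reclaimed later by migrating the surviving small files and dropping the whole data file.

```python
from dataclasses import dataclass, field


@dataclass
class AggregationUnit:
    """Hypothetical per-data-file index: small-file name -> (offset, length)."""
    data_file: str
    live: dict = field(default_factory=dict)
    size: int = 0        # bytes appended to the data file so far
    dead_bytes: int = 0  # bytes left behind by deleted small files (fragments)

    def append(self, name: str, length: int) -> None:
        # Append-only write, since in-place rewriting is assumed unavailable.
        self.live[name] = (self.size, length)
        self.size += length

    def delete(self, name: str) -> None:
        # Real-time metadata deletion: the bytes stay in the data file.
        _, length = self.live.pop(name)
        self.dead_bytes += length

    def fragmentation(self) -> float:
        return self.dead_bytes / self.size if self.size else 0.0


def pick_victims(units: dict, threshold: float) -> list:
    """Assumed master-side policy: most fragmented data files first."""
    names = [n for n, u in units.items() if u.fragmentation() >= threshold]
    return sorted(names, key=lambda n: units[n].fragmentation(), reverse=True)


def defragment(units: dict, victim: str, target: str) -> int:
    """Assumed worker-side step: migrate live files, then drop the data file.

    Returns the net number of bytes reclaimed (the victim's fragments).
    """
    src, dst = units[victim], units[target]
    for name, (_, length) in src.live.items():
        dst.append(name, length)  # live data is re-appended elsewhere
    reclaimed = src.dead_bytes
    del units[victim]             # stands in for deleting the whole data file
    return reclaimed
```

In this model the aggregation unit is what makes defragmentation possible: without the per-data-file list of live entries, a worker would have no way to tell which byte ranges still belong to undeleted small files.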

Keywords: mass small file system; Hadoop Distributed File System (HDFS); aggregation unit; space reclamation; SMDFS

Classification number: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
