Separated Metadata Based Greedy Prefetching Data Restore in Deduplication Environment (Cited by: 3)

Authors: 杨儒 [1], 邓玉辉 [1,2], 魏文国 [3]

Affiliations: [1] Department of Computer Science, College of Information Science and Technology, Jinan University, Guangzhou 510632, China; [2] State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; [3] School of Electronics and Information, Guangdong Polytechnic Normal University (广东技术师范学院), Guangzhou 510665, China

Published in: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2017, Issue 5, pp. 930-935 (6 pages)

Funding: National Natural Science Foundation of China (61572232, 61272073); Key Program of the Natural Science Foundation of Guangdong Province (S2013020012865); Open Fund of the State Key Laboratory of Computer Architecture, Chinese Academy of Sciences (CARCH201401); Fundamental Research Funds for the Central Universities; Guangdong Province Special Fund for Public Welfare Research and Capacity Building (2014A010103032)

Abstract: The purpose of data backup is restore. Because logically contiguous data ends up physically scattered across different disk locations, the fragmentation produced by traditional deduplication seriously degrades a system's restore performance. Existing optimization methods attempt to improve later restore performance by applying rewriting algorithms at backup time, but rewriting inherently sacrifices deduplication ratio to gain better restore performance, which ultimately wastes disk space. Moreover, traditional methods generate only a single copy of backup metadata for later restore, which forces the system to access on-disk metadata frequently and inefficiently during the restore process. This paper proposes organizing backup metadata into separate file metadata and chunk metadata, and prefetching metadata more aggressively, thereby improving restore performance and throughput without sacrificing the deduplication ratio and while making full use of hardware resources. Experimental evaluation of restore performance on real-world data sets shows that, compared with the restore performance achieved by history-aware and context-based rewriting algorithms, the separated-metadata restore scheme improves restore performance by 27.2% and 29.3% respectively, while on average saving 1.91% and 4.36% of deduplication ratio.
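The paper itself gives no code; the following Python sketch only illustrates the idea stated in the abstract, namely keeping file metadata (a file's ordered list of chunk fingerprints, i.e. the recipe) separate from chunk metadata (fingerprint to container location) and greedily prefetching a window of chunk-metadata entries ahead of the restore cursor. All names and the PREFETCH_DEPTH value are illustrative assumptions, not the authors' implementation.

```python
from collections import OrderedDict

PREFETCH_DEPTH = 1024  # upcoming fingerprints resolved per metadata disk trip (assumed value)


class ChunkMetadataStore:
    """Stand-in for the on-disk chunk-metadata index (fingerprint -> container location).

    A batch lookup models one sequential disk access that resolves many
    entries at once, which is the cost the prefetching is meant to amortize.
    """

    def __init__(self, chunk_meta):
        self.chunk_meta = chunk_meta        # fingerprint -> (container_id, offset, length)
        self.disk_accesses = 0

    def batch_lookup(self, fingerprints):
        self.disk_accesses += 1             # one prefetch trip covers the whole batch
        return {fp: self.chunk_meta[fp] for fp in fingerprints}


def restore_file(file_recipe, store, read_chunk):
    """Restore one file from its recipe (ordered list of chunk fingerprints).

    Chunk metadata is greedily prefetched PREFETCH_DEPTH entries ahead of the
    restore cursor, so the metadata index is hit once per window instead of
    once per chunk.
    """
    data = bytearray()
    prefetched = OrderedDict()              # fingerprint -> (container_id, offset, length)
    for i, fp in enumerate(file_recipe):
        if fp not in prefetched:
            window = file_recipe[i:i + PREFETCH_DEPTH]
            prefetched.update(store.batch_lookup(window))
        container_id, offset, length = prefetched[fp]
        data += read_chunk(container_id, offset, length)
    return bytes(data), store.disk_accesses
```

Under these assumptions the number of metadata disk trips drops from roughly one per chunk to roughly one per PREFETCH_DEPTH chunks, which is the kind of throughput effect the abstract attributes to more aggressive metadata prefetching; the real system would additionally bound the prefetch cache by available memory.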

Keywords: storage; data deduplication; rewriting; metadata prefetching; restore

CLC number: TP309 (Automation and Computer Technology - Computer System Architecture)

 
