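Author: 谢平 (Xie Ping) [1,2]
Affiliations: [1] School of Computer Science, Qinghai Normal University, Xining 810008; [2] School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074
Source: 《计算机科学》 (Computer Science), 2014, No. 1, pp. 22-30, 42 (10 pages in total)
Funding: Supported by the National Basic Research Program of China (973 Program), grant 2011CB302303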
Abstract: Ever-growing enterprise data volumes pose a serious challenge for data centers. Studies show that as much as 60% of the data in storage systems is redundant, and reducing this redundancy has drawn increasing attention from researchers. Data deduplication uses CPU resources to compare data-block fingerprints and thereby reduces the storage space required; it has become a hot topic in both industry and academia. Based on an analysis and summary of the data deduplication literature of the past ten years, this paper first analyzes the architecture of volume-level deduplication systems and presents the principles, implementation mechanisms, and evaluation criteria of deduplication systems. It then analyzes and summarizes the techniques for improving deduplication performance, taking into account how data characteristics and system scale affect that performance. Finally, it compares deduplication systems across application scenarios and identifies four directions that need further research: deduplication schemes for primary storage, deduplication schemes for distributed clustered storage systems, fast fingerprint lookup optimization, and intelligent data detection techniques.
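Keywords: data deduplication; deduplication ratio; architecture; metadata structure; I/O optimization
Classification number: TP311 [Automation and Computer Technology: Computer Software and Theory]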
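
The abstract describes deduplication as comparing data-block fingerprints so that each distinct block is stored only once. The following is a minimal sketch of that idea, not the method of any system surveyed in the paper. Fixed-size blocks, SHA-256 as the fingerprint, the in-memory dictionary, the function names, and the ratio definition (original size divided by stored size) are all assumptions made for illustration.

```python
import hashlib


def deduplicate(data: bytes, block_size: int = 4096):
    """Split data into fixed-size blocks and keep one copy per fingerprint.

    Returns (store, recipe): store maps fingerprint -> block bytes,
    recipe is the ordered list of fingerprints needed to rebuild the data.
    """
    store = {}
    recipe = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        # The fingerprint stands in for the block when checking for duplicates.
        fingerprint = hashlib.sha256(block).hexdigest()
        if fingerprint not in store:
            store[fingerprint] = block  # first occurrence: store the block
        recipe.append(fingerprint)      # every occurrence: record a reference
    return store, recipe


def restore(store, recipe) -> bytes:
    """Rebuild the original data from the stored blocks and the recipe."""
    return b"".join(store[fingerprint] for fingerprint in recipe)


def dedup_ratio(data: bytes, store) -> float:
    """Original size divided by stored size (assumed definition)."""
    stored = sum(len(block) for block in store.values())
    return len(data) / stored if stored else 0.0
```

For example, calling `deduplicate` on three identical 4096-byte blocks followed by one different block keeps two blocks in the store while the recipe still lists four fingerprints, and `restore` returns the original bytes.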