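Authors: 王意洁 (Wang Yijie)[1,2], 许方亮 (Xu Fangliang)[1,2], 裴晓强 (Pei Xiaoqiang)[1,2]
Affiliations: [1] National Key Laboratory of Parallel and Distributed Processing, National University of Defense Technology, Changsha 410073; [2] College of Computer, National University of Defense Technology, Changsha 410073
Source: 《计算机学报》 (Chinese Journal of Computers), 2017, No. 1, pp. 236-255 (20 pages)
Funding: National Natural Science Foundation of China (61379052); National Key Research and Development Program of China (2016YFB1000101); National High Technology Research and Development Program of China ("863" Program) (2013AA01A213); Hunan Provincial Natural Science Foundation for Distinguished Young Scholars (14JJ1026); Specialized Research Fund for the Doctoral Program of Higher Education (20124307110015)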
Abstract: Big data is extremely large in volume and keeps growing rapidly, which poses a severe challenge to the performance and scalability of storage systems. Distributed storage systems built from inexpensive commodity servers offer strong service capability, low cost, and easy scalability, and are therefore widely used for storing and managing big data. However, the large number of storage nodes in such systems makes node failures a common occurrence in daily operation, so a fault-tolerance technique is needed to guarantee data reliability. Replication and erasure coding are the two common approaches to protecting data from node failures. Compared to replication, erasure coding incurs much lower storage overhead while offering the same or even higher data reliability. For this reason, with the explosive growth of data in recent years, erasure coding has attracted wide attention. This paper surveys the state of research on erasure coding in distributed storage systems. First, we introduce the basic principles and main concepts of erasure coding, and point out the main technical challenges of integrating erasure coding into large-scale distributed storage systems. Second, we review research progress from the aspects of encoding implementation, erasure code design, data repair, and data update, analyze the characteristics and limitations of each key technique, and compare the encoding and repair performance of existing erasure codes against the main evaluation metrics. Finally, based on the latest research, we point out future research directions for erasure coding in distributed storage systems, including synchronous data encoding, regenerating codes with low redundancy, and data failure prediction.
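Keywords: distributed storage; erasure coding; encoding implementation; data repair; data update
Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]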
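The abstract's claim that erasure coding gives the same or higher reliability than replication at much lower storage overhead can be illustrated with a short calculation. The following is a minimal sketch, not taken from the paper: it assumes a maximum distance separable (MDS) code such as Reed-Solomon, in which any k of the n stored blocks suffice to recover the data, and the parameters (3 copies, 6 data plus 3 parity blocks) and all names (`Scheme`, `replication`, `mds_code`) are hypothetical choices for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scheme:
    """A redundancy scheme that stores n blocks for every k blocks of user data."""
    name: str
    data_blocks: int   # k: blocks of original data
    total_blocks: int  # n: blocks actually stored (data + redundancy)

    @property
    def storage_overhead(self) -> float:
        # Raw bytes stored per byte of user data.
        return self.total_blocks / self.data_blocks

    @property
    def tolerated_failures(self) -> int:
        # Assumption: any k of the n blocks can rebuild the data
        # (true for replication and for MDS codes such as Reed-Solomon).
        return self.total_blocks - self.data_blocks


def replication(copies: int) -> Scheme:
    # Replication is the special case k = 1, n = number of copies.
    return Scheme(f"{copies}-way replication", 1, copies)


def mds_code(k: int, m: int) -> Scheme:
    # k data blocks plus m parity blocks.
    return Scheme(f"({k}+{m}) MDS erasure code", k, k + m)


if __name__ == "__main__":
    for scheme in (replication(3), mds_code(6, 3)):
        print(f"{scheme.name}: {scheme.storage_overhead:.1f}x storage, "
              f"tolerates {scheme.tolerated_failures} lost blocks")
```

Under these assumed parameters, 3-way replication stores 3.0 times the user data and survives the loss of 2 blocks, while the (6+3) code stores 1.5 times the user data and survives the loss of 3. The sketch ignores repair traffic and encoding cost, which the abstract names among the main challenges of erasure coding.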