Node Selection Scheme for Data Repair in Heterogeneous Distributed Storage Systems

Cited by: 9


Authors: ZHONG Feng-yan (钟凤艳); WANG Yan (王艳) [1]; LI Nian-shuang (李念爽) (School of Software, East China Jiaotong University, Nanchang 330013, China)

Affiliation: [1] School of Software, East China Jiaotong University

Source: Computer Science (《计算机科学》), 2019, Issue 8, pp. 35-41 (7 pages)

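Funding: Supported by the National Natural Science Foundation of China (61402172), a project of the Education Department of Jiangxi Province (GJJ150509), and the Humanities and Social Sciences Fund of the Ministry of Education (15YJA860013).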

Abstract: In recent years, the rapid growth of massive data has posed severe challenges to existing storage systems, including storage cost and data reliability requirements. Erasure codes have attracted wide attention from academia and industry because they provide higher data reliability at the same storage overhead. However, the coding characteristics of erasure codes introduce considerable extra overhead during data repair in storage systems that use them, such as computation, scheduling, transmission, and disk reads and writes. Recent studies of erasure-code data repair all rest on the assumption that the nodes in a distributed storage system are identical. In practice, in large-scale data centers, equipment replacement, hardware failures, and other causes not only lead to data loss but also make the storage cost of individual storage nodes differ, so that the amount of data stored on each node is not always equal; this phenomenon is called storage capacity heterogeneity. Repair under heterogeneous storage capacity faces the problem of selecting provider nodes, and a node selection strategy is needed to lower the repair overhead and improve the reliability and availability of the storage system. Given that the nodes participating in an actual repair incur different data transmission costs, this paper proposes a node selection strategy, the tree topology repair algorithm, to reduce the repair cost of the whole repair process. Simulation results show that, compared with the fixed node selection strategy of IFR codes, the proposed tree selection strategy further reduces the data repair cost on average.

Keywords: distributed storage system; node heterogeneity; regenerating codes; data repair

Classification: TP309.3 [Automation and Computer Technology: Computer System Architecture]

 
