三数据中心下的纠删码算法研究  

Study on Erasure Code Algorithm for Three Data Centers

在线阅读下载全文

作  者:孙婧[1] 牛虹婷 梁松涛 SUN Jing;NIU Hongting;LIANG Songtao(Department of Intelligent Science and Information Law,East China University of Political Science and Law,Shanghai 201620,China;School of Computer Science and Engineering,Beihang University,Beijing 100191,China;Multimedia Technology Center,Shanghai Bilibili Technology Co.,Ltd,Shanghai 200433,China)

机构地区:[1]华东政法大学智能科学与信息法学系,上海201620 [2]北京航空航天大学计算机学院,北京100191 [3]上海哔哩哔哩科技有限公司多媒体技术中心,上海200433

出  处:《计算机科学》2025年第2期48-57,共10页Computer Science

基  金:国家自然科学基金(12161080);数据恢复四川省重点实验室开放基金项目(DRN2204)。

摘  要:纠删码算法在单数据中心和多数据中心得到了广泛的应用。目前对纠删码算法的研究更多地关注存储成本和修复带宽,对于如何在专线带宽、交换机受限的情况下完成多数据中心之间的修复,如何在可靠性、容错能力等核心因素之间实现最佳权衡等问题,没有进行充分的分析和解决。针对三数据中心这种最常用的多数据中心场景,首先,提出了纠删码在系统设计中重要的4个因素:冗余度、可靠性、容错能力及解码带宽。其次,根据提出的4个因素,设计了一种单数据中心下满足最优带宽修复的S-LRC算法。再根据提出的S-LRC算法,设计了满足三中心架构体系下的G-LRC算法。相比传统的编码方案,提出的G-LRC算法具有更高的可靠性、更大的容错性及解码带宽惩罚比。其两节点故障时解码带宽惩罚比仅为传统方案的1/7~2/7。最后,将G-LRC算法在大文件存储系统中进行了实现和验证,并且设计了解码最优决策算法来减少修复的带宽,解决了非最大距离可分割码算法在系统中落地难的问题。Erasure coding algorithms are widely applied in both single-data-center and multi-data-center environments.Current research on erasure coding algorithms primarily focuses on storage cost and repair bandwidth.However,challenges such as how to perform repairs across multiple data centers with limited dedicated bandwidth and switch constraints,and how to balance key factors like reliability and fault tolerance,have not been thoroughly analyzed or addressed.This study targets the three-data-center scenario,which is one of the most common multi-data-center configurations.First,it identifies four crucial factors for erasure coding in system design:redundancy,reliability,fault tolerance,and decoding bandwidth.Based on these factors,it then proposes an single-data-center local reconstruction code(S-LRC)algorithm that achieves optimal bandwidth repair within a single data center.Building on the S-LRC algorithm,it further develops the global local reconstruction code(G-LRC)algorithm to accommodate the architecture of a three-data-center setup.Compared to traditional coding schemes,the proposed G-LRC algorithm offers higher reliability,greater fault tolerance,and a lower decoding bandwidth penalty.Specifically,for two-node failures,the decoding bandwidth penalty of G-LRC is only 1/7~2/7 of that of traditional schemes.Finally,the G-LRC algorithm is implemented and validated in a large file storage system,where an optimal decoding decision algorithm is designed to reduce repair bandwidth,addressing the deployment challenges of non maximum distance separable(non-MDS)codes in practical systems.

关 键 词:三数据中心 纠删码 局部可修复码 最大可恢复编码 REED-SOLOMON码 

分 类 号:TP391.4[自动化与计算机技术—计算机应用技术]

 

参考文献:

正在载入数据...

 

二级参考文献:

正在载入数据...

 

耦合文献:

正在载入数据...

 

引证文献:

正在载入数据...

 

二级引证文献:

正在载入数据...

 

同被引文献:

正在载入数据...

 

相关期刊文献:

正在载入数据...

相关的主题
相关的作者对象
相关的机构对象