Improvement Strategy of HDFS Replica Placement Policy under Static Scenarios


Authors: BAO Jing-yuan [1]; ZHAO Ning [2]; WANG Yang [3]

Affiliations: [1] Second Military Representative Office in Wuhan, Wuhan Military Representative Bureau of Navy Equipment, Wuhan 430064, Hubei, China [2] School of Computer Science and Technology, Wuhan University of Technology, Wuhan 430070, Hubei, China [3] KE Holdings Inc., Beijing 100089, China

Source: Software Guide (《软件导刊》), 2021, No. 6, pp. 96-101 (6 pages)

Abstract: Replica placement policies in cloud storage systems affect availability, reliability, data load balance, and data access speed. In static scenarios, where data are processed batch by batch, the default HDFS policy places the three replicas of each data block independently and does not consider how the placement of the whole batch affects cluster load, access failure rate, or overall read/write performance. To address this, the paper improves a replica placement policy based on the binary multi-objective grey wolf optimization algorithm, taking the mean replica failure rate, the standard deviation of relative disk load, and a read estimation factor as objective functions, and obtains a placement for which all three objective values are low. Experimental results show that the cluster achieves better overall load balance with the improved policy, and that in an experiment reading a specified number of randomly chosen data blocks it takes far less time than the default policy. In addition, the improved policy does not use the default replication factor of 3; each data block has a minimum of two replicas, which reduces the load on the cluster.
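To make the objective functions more concrete, the following is a minimal sketch of how two of them might be evaluated for one candidate batch placement. It is not the paper's formulation: the abstract does not give the definitions, so the sketch assumes that a placement is a binary matrix (one row per block, one column per DataNode), that nodes fail independently with known probabilities, and that relative load is used space divided by capacity. All function and parameter names are hypothetical. The read estimation factor and the grey wolf optimizer itself are left out because the abstract does not describe them.

```python
from math import prod
from statistics import pstdev


def mean_replica_fail_rate(placement, node_fail_prob):
    """Average probability, over all blocks, that every replica is lost.

    Assumption: nodes fail independently, so a block is lost with
    probability equal to the product of the failure probabilities of
    the nodes that hold one of its replicas.
    """
    rates = [
        prod(p for bit, p in zip(row, node_fail_prob) if bit)
        for row in placement
    ]
    return sum(rates) / len(rates)


def relative_load_std(placement, block_size, used, capacity):
    """Population standard deviation of per-node relative disk load.

    Assumption: relative load = (space already used + space added by
    this batch) / disk capacity, with all blocks of equal size.
    """
    loads = []
    for j in range(len(capacity)):
        added = sum(row[j] for row in placement) * block_size
        loads.append((used[j] + added) / capacity[j])
    return pstdev(loads)


def satisfies_min_replicas(placement, minimum=2):
    """Check the constraint that every block keeps at least `minimum` replicas."""
    return all(sum(row) >= minimum for row in placement)


# Two candidate placements of blocks on three nodes.
balanced = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
skewed = [[1, 1, 0], [0, 1, 1]]
```

A multi-objective optimizer would score every candidate matrix with functions like these and keep the candidates that no other candidate beats on all objectives at once; the binary encoding fits naturally because each matrix entry is a yes/no decision about placing a replica on a node.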

Keywords: distributed file system; HDFS; replica placement policy; multi-objective optimization; cloud storage

Classification number: TP301 [Automation and Computer Technology: Computer System Architecture]

 
