Hadoop Replica Placement Strategy  Cited by: 7

Research on the replica placement strategy of Hadoop


Authors: Shao Xiuli (邵秀丽)[1], Wang Yaguang (王亚光)[1], Li Yunlong (李云龙)[1], Liu Yiwei (刘一伟)[2]

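Affiliations: [1] College of Information Technical Science, Nankai University, Tianjin 300071; [2] School of Mathematical Sciences, Peking University, Beijing 100871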

Source: CAAI Transactions on Intelligent Systems (《智能系统学报》), 2013, Issue 6, pp. 489-496 (8 pages)

Funding: Supported by the Tianjin Binhai New Area Science and Technology Project (12ZCDZGX46700; 13ZCZDGX02500)

Abstract: The Hadoop Distributed File System (HDFS) uses a random replica placement strategy, so data distribution becomes unbalanced after the system has run for some time, which reduces data reliability and read speed. To address this problem with the default HDFS replica placement strategy, the strategy is improved: when choosing where to place a replica, nodes with low storage utilization are given priority. The first simulation experiment tested the effect of the number of racks on the algorithm; the results show that, under the improved strategy, the number of racks has little influence on cluster balance, and the cluster remains well balanced. The second simulation experiment compared the standard deviation of node storage utilization across the cluster under the original and improved strategies as the amount of written data increased; it confirms that the improved strategy balances storage better than the original one.
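
The idea in the abstract, preferring nodes with low storage utilization and measuring balance by the standard deviation of node utilization, can be illustrated with a small simulation. This is only a minimal sketch under stated assumptions, not the authors' algorithm or experimental setup: it ignores racks, capacity limits, and the HDFS API entirely, and the function names, node capacities, and block counts are all hypothetical.

```python
import random
import statistics


def place_random(nodes, replicas, rng):
    """Baseline (assumed): choose distinct nodes uniformly at random."""
    return rng.sample(range(len(nodes)), replicas)


def place_least_used(nodes, replicas, rng):
    """Sketch of the improved idea: prefer nodes with the lowest utilization.

    Ties are broken randomly so that equally loaded nodes are not
    always picked in index order.
    """
    order = sorted(
        range(len(nodes)),
        key=lambda i: (nodes[i]["used"] / nodes[i]["capacity"], rng.random()),
    )
    return order[:replicas]


def simulate(place, num_nodes=20, num_blocks=2000, replicas=3, seed=0):
    """Write blocks with the given placement function.

    Returns the population standard deviation of node utilization.
    Capacities are hypothetical and heterogeneous; racks are not modeled.
    """
    rng = random.Random(seed)
    nodes = [
        {"capacity": rng.choice([500, 1000, 2000]), "used": 0}
        for _ in range(num_nodes)
    ]
    for _ in range(num_blocks):
        for i in place(nodes, replicas, rng):
            nodes[i]["used"] += 1  # one block occupies one capacity unit
    rates = [n["used"] / n["capacity"] for n in nodes]
    return statistics.pstdev(rates)


if __name__ == "__main__":
    print("random placement     :", simulate(place_random))
    print("least-used placement :", simulate(place_least_used))
```

A lower standard deviation means utilization is spread more evenly across nodes, which is the metric the abstract's second experiment compares.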

Keywords: cloud storage; HDFS; replica placement; storage balancing; storage node

Classification code: TP333 [Automation and Computer Technology: Computer System Architecture]

 
