Optimization of the data replica placement strategy in heterogeneous Hadoop clusters  Cited by: 1

Improved replica placement strategy for data in heterogeneous Hadoop cluster


Authors: Liu Yan (刘艳)[1], Cai Yandong (蔡燕冬)[1], Xie Xiaodong (谢晓东)[1], Zhang Qinglei (张庆磊)[1]

Affiliation: [1] College of Computer Science and Technology, Huaqiao University, Xiamen 361021, Fujian, China

Source: Journal of Huazhong University of Science and Technology (Natural Science Edition), 2016, No. 7, pp. 63-68 (6 pages)

Funding: National Natural Science Foundation of China, Youth Program (61202106; 61572206); Natural Science Foundation of Fujian Province (2013J01238)

Abstract: The default replica placement strategy of the Hadoop distributed file system does not consider the heterogeneous hardware configuration of cluster nodes, the characteristics of file access, or real-time load. In a heterogeneous environment this lowers the proportion of data-local computing tasks, hinders the use of data locality in Map tasks, and degrades computing performance. To address this, an optimized replica placement strategy for computational data is proposed. The strategy quantifies data access characteristics together with each node's real-time performance and load, and selects storage nodes for replicas on the principle that a node's data access load should match its computing performance. Experimental results show that, compared with the default strategy, the optimized strategy selects suitable storage nodes for replicas more effectively and raises both the proportion of data-local computing tasks and the computing performance of the heterogeneous cluster. A cluster using the optimized strategy also adapts better to changes in its nodes, showing better stability and resilience.
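The matching principle described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the record does not give the paper's formulas, so the `Node` class, the `perf` and `load` scores, the `heat` value, and the greedy ranking rule below are all hypothetical stand-ins chosen only to show how access load might be matched to computing performance.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    perf: float        # hypothetical computing-performance score (higher is faster)
    load: float = 0.0  # hypothetical data-access load already assigned to the node


def place_replicas(nodes, heat, replicas=3):
    """Pick `replicas` distinct nodes for a block whose access heat is `heat`.

    Assumption: each replica serves an equal share of the block's accesses,
    and a node is a better target the lower its load-to-performance ratio
    would be after receiving that share.
    """
    if replicas > len(nodes):
        raise ValueError("not enough nodes for the requested replica count")
    share = heat / replicas
    # Rank nodes by the projected ratio of access load to computing performance.
    ranked = sorted(nodes, key=lambda n: (n.load + share) / n.perf)
    chosen = ranked[:replicas]
    for node in chosen:
        node.load += share  # record the newly assigned access load
    return [node.name for node in chosen]
```

Under these assumptions, a fast, lightly loaded node is preferred over a slow or already busy one, so frequently accessed blocks tend to land where Map tasks can process them locally and quickly.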

Keywords: distributed file system; Hadoop; data placement; heterogeneity; data hotness

Classification number: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
