Authors: ZHOU Chang-jun (周长俊); ZONG Ping (宗平) [2]
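Affiliations: [1] School of Computer, Nanjing University of Posts and Telecommunications, Nanjing 210003, Jiangsu, China; [2] School of Overseas Education, Nanjing University of Posts and Telecommunications, Nanjing 210023, Jiangsu, China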
Source: Computer Technology and Development (《计算机技术与发展》), 2019, No. 1, pp. 11-16 (6 pages)
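Funding: National "863" High-Tech Development Program of China (2006AA01Z208); Natural Science Basic Research Project of Jiangsu Higher Education Institutions (06KJB520079)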
Abstract: Under the default Hadoop backup data storage strategy, when a local data copy fails it must be restored from backup data stored on a remote rack. Because the default strategy chooses the storage node for backup data at random, backup data can end up unevenly distributed across nodes, and recovery can consume a large amount of internal bandwidth because of the distance between nodes. To address these problems, an improved backup data storage strategy is proposed. The strategy takes the distance between nodes, the load of each node, and the number of times the backup data has been recovered into account when selecting a node. From these factors it computes a matching degree for each node, and the node with the highest matching degree is selected as the optimal node for storing backup data across remote racks. The strategy balances the load of backup data placement among nodes and accounts for the internal bandwidth consumed during data recovery. By also considering how often a data copy has failed, it enables rapid recovery of frequently failing copies. The improved strategy was implemented on the Hadoop platform, and the results met the expected requirements.
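The abstract names the three inputs to the matching degree but does not give the formula, so the following is only a minimal sketch of how such a selection could look. Everything in it is an assumption made for illustration: the `Candidate` fields, the normalization of distance and load, and in particular the rule that a higher recovery count shifts weight from load toward distance. It is not the authors' method and does not use any Hadoop API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A remote-rack node that could hold the backup copy (hypothetical model)."""
    name: str
    distance: float  # assumed network distance to the node holding the local copy
    load: float      # assumed fraction of storage in use, between 0.0 and 1.0


def matching_degree(node, recovery_count, max_distance):
    """Score a node; higher is better. The weighting rule is an assumption."""
    # Nearer nodes score higher; guard against all candidates being at distance 0.
    closeness = 1.0 - node.distance / max_distance if max_distance > 0 else 1.0
    # Less loaded nodes score higher.
    free = 1.0 - node.load
    # Assumed rule: data that has been recovered often weights distance more,
    # so that frequently failing copies are placed where recovery is cheap.
    w_distance = (1 + recovery_count) / (2 + recovery_count)
    return w_distance * closeness + (1.0 - w_distance) * free


def select_node(candidates, recovery_count):
    """Return the candidate with the highest matching degree."""
    if not candidates:
        raise ValueError("no candidate nodes")
    max_distance = max(c.distance for c in candidates)
    return max(candidates,
               key=lambda c: matching_degree(c, recovery_count, max_distance))


nodes = [Candidate("near-busy", 2, 0.9), Candidate("far-idle", 6, 0.1)]
print(select_node(nodes, recovery_count=0).name)  # far-idle
print(select_node(nodes, recovery_count=8).name)  # near-busy
```

With these assumed weights, data that has never needed recovery goes to the lightly loaded node, while data that has been recovered eight times goes to the nearer node even though it is busier. This mirrors the trade-off the abstract describes between load balancing and recovery bandwidth.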
Keywords: Hadoop; backup data storage strategy; internal bandwidth; load balancing; hot data
Classification: TP31 [Automation and Computer Technology: Computer Software and Theory]