Authors: Wang Lifang (王丽芳)[1], Zhang Zhike (张志珂)[2], Jiang Zejun (蒋泽军)[1], Cai Xiaobin (蔡小斌)[1], Peng Chengzhang (彭成章)[1]
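Affiliations: [1] School of Computer Science, Northwestern Polytechnical University, Xi'an, Shaanxi 710072; [2] State Grid Henan Electric Power Company, Zhengzhou, Henan 450052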
Source: Journal of Northwestern Polytechnical University (《西北工业大学学报》), 2014, No. 4, pp. 658-663 (6 pages)
Funding: Supported by the National Natural Science Foundation of China (61373120) and the Aeronautical Science Foundation (2012ZC53040)
Abstract: Deduplication clusters are an effective way to meet the ever-growing demand for massive data backup. Their key problem is the data routing strategy, that is, how to distribute data sensibly across the nodes of the cluster. Existing data routing strategies use the minimum chunk signature of a file or data segment to compute the target node, an approach referred to as MCS (minimum chunk signature). When the cluster is small, the storage usage of MCS is close to that of single-node deduplication. When the cluster is large, however, its storage usage is far worse than that of single-node deduplication. To reduce the storage usage of deduplication clusters, the authors propose a path-based data routing strategy called DRSD (data routing strategy based on directories). Experimental results show that, for various numbers of nodes, the deduplication ratio of DRSD is markedly higher than that of MCS and close to that of single-node deduplication. With 64 nodes, the deduplication ratio of DRSD is 35% higher than that of MCS.
Keywords: deduplication cluster; stateless data routing algorithm; file path; storage usage
Classification: TP301.6 [Automation and Computer Technology: Computer Architecture]
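The routing idea summarized in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the paper's implementation: the record gives no details of chunking, hashing, or how DRSD maps directories to nodes. Fixed-size 4 KiB chunks, SHA-1 signatures, and the modulo mapping are all assumptions, and the function names `chunk_signatures`, `route_by_min_signature`, and `route_by_directory` are hypothetical. The second routing function is merely one plausible way to route by directory and should not be read as DRSD itself.

```python
import hashlib
import posixpath

CHUNK_SIZE = 4096  # assumed fixed chunk size; the record does not specify chunking


def chunk_signatures(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[bytes]:
    """Split data into fixed-size chunks and return the SHA-1 digest of each."""
    return [
        hashlib.sha1(data[i:i + chunk_size]).digest()
        for i in range(0, len(data), chunk_size)
    ]


def route_by_min_signature(data: bytes, num_nodes: int) -> int:
    """MCS-style routing: pick the node from the smallest chunk signature."""
    sigs = chunk_signatures(data)
    if not sigs:
        return 0  # assumption: empty input goes to node 0
    return int.from_bytes(min(sigs), "big") % num_nodes


def route_by_directory(path: str, num_nodes: int) -> int:
    """Hypothetical directory-based routing: hash the parent directory path.

    Files in the same directory land on the same node. This is an
    illustrative stand-in, not the DRSD algorithm from the paper.
    """
    directory = posixpath.dirname(path)
    digest = hashlib.sha1(directory.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % num_nodes
```

Both functions are stateless: the target node depends only on the input and the node count, so no lookup table has to be shared across the cluster. Two inputs that contain the same set of chunks share a minimum signature and are routed to the same node by `route_by_min_signature`, while `route_by_directory` keeps files from one directory together regardless of their content.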