A Grouping Prediction Method Based on Undirected Graph Traversal in De-Duplication Systems  Cited by: 5

Authors: Wang Longxiang (王龙翔)[1], Zhang Xingjun (张兴军)[1], Zhu Guofeng (朱国峰)[1], Zhu Yueguang (朱跃光)[1], Dong Xiaoshe (董小社)[1]

Affiliation: [1] School of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China

Source: Journal of Xi'an Jiaotong University (《西安交通大学学报》), 2013, No. 10, pp. 51-56 (6 pages)

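Funding: National Key Technology R&D Program of China (2011BAH04B03); National Natural Science Foundation of China (61173039); National High Technology Research and Development Program of China (2011AA01A204)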

Abstract: To address the problem that the storage capacity of a de-duplication system is limited by memory and is difficult to scale, an index table grouping prediction method based on undirected graph traversal is proposed. The method stores the index table on disk and maintains an index table cache in memory, which raises the maximum storage capacity the system can support. Keeping the index on disk leads to a low cache hit rate and poor system performance, so the method builds an undirected graph from the features of the data chunk access sequence, analyzes the graph, groups index entries according to the analysis, and performs cache replacement in units of groups, improving both the cache hit rate and system performance. Experimental results show that, with cache prefetching and undirected graph traversal grouping, and with the cache set to only 10% of the index table size, the maximum storage capacity of the IDSMS de-duplication storage system is 7.5 times higher than that of the existing method, and the index table cache hit rate rises from 47% without index entry grouping to 87.6%.
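The pipeline outlined in the abstract (access sequence, undirected graph, groups, group-wise cache replacement) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not say how edges are chosen, how the graph is traversed, how large a group may be, or which replacement policy is used. The sketch assumes edges between fingerprints that are adjacent in the access stream, a breadth-first traversal cut into bounded groups, and LRU replacement over whole groups. All names (`build_access_graph`, `group_by_traversal`, `GroupCache`, `window`, `max_group_size`, `capacity_groups`) are hypothetical.

```python
from collections import OrderedDict, defaultdict, deque


def build_access_graph(fingerprints, window=1):
    """Assumed edge rule: link fingerprints within `window` positions of each other."""
    graph = defaultdict(set)
    for i, fp in enumerate(fingerprints):
        graph[fp]  # touch the key so isolated fingerprints still become vertices
        for other in fingerprints[i + 1 : i + 1 + window]:
            if other != fp:
                graph[fp].add(other)
                graph[other].add(fp)
    return graph


def group_by_traversal(graph, max_group_size=4):
    """Breadth-first traversal of each component, cut into groups of bounded size."""
    visited = set()
    groups = []
    for start in list(graph):
        if start in visited:
            continue
        visited.add(start)
        queue = deque([start])
        order = []
        while queue:
            node = queue.popleft()
            order.append(node)
            for neighbor in sorted(graph[node]):  # sorted for a deterministic order
                if neighbor not in visited:
                    visited.add(neighbor)
                    queue.append(neighbor)
        for i in range(0, len(order), max_group_size):
            groups.append(order[i : i + max_group_size])
    return groups


class GroupCache:
    """Index cache that loads and evicts whole groups (LRU is an assumed policy)."""

    def __init__(self, groups, capacity_groups):
        self.group_of = {fp: gid for gid, group in enumerate(groups) for fp in group}
        self.capacity = capacity_groups
        self.resident = OrderedDict()  # group id -> None, oldest first
        self.hits = 0
        self.misses = 0

    def lookup(self, fp):
        gid = self.group_of[fp]
        if gid in self.resident:
            self.hits += 1
            self.resident.move_to_end(gid)  # mark the group as recently used
            return True
        self.misses += 1
        self.resident[gid] = None  # stands in for reading the whole group from disk
        if len(self.resident) > self.capacity:
            self.resident.popitem(last=False)  # evict the least recently used group
        return False


if __name__ == "__main__":
    stream = ["a", "b", "c", "d"] * 3  # toy chunk fingerprint stream
    groups = group_by_traversal(build_access_graph(stream), max_group_size=4)
    cache = GroupCache(groups, capacity_groups=1)
    for fp in stream:
        cache.lookup(fp)
    print(groups, cache.hits, cache.misses)
```

On this toy stream, one miss brings in the whole group and the following lookups hit, which is the prefetching effect the abstract attributes to grouping. With `max_group_size=1` (no grouping) and the same cache capacity, every lookup misses. The numbers are illustrative only and say nothing about the 47% and 87.6% hit rates reported in the paper.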

Keywords: de-duplication; grouping prediction; large-scale storage system

Classification: TP333 [Automation and Computer Technology: Computer System Architecture]

 
