Research on Deduplication in Information Cluster Based on File Path

Authors: YANG Mei-yan [1]; XU Qing-zeng [1]

Affiliation: [1] College of Artificial Intelligence, Tianjin University of Science and Technology, Tianjin 300457, China

Source: Computer Simulation (《计算机仿真》), 2022, No. 2, pp. 462-466 (5 pages)

Abstract: Traditional methods delete duplicate data in a poorly ordered logical sequence, which leads to unsatisfactory deduplication. To address this problem, a deduplication method for information clusters based on file paths is studied. Starting from the chunking idea used in deduplication and from the directory names in the file system, the principle of file-path-based deduplication is analyzed. Stored data are screened chunk by chunk to complete the data comparison, so that redundant copies are removed and replaced by pointers to the single remaining instance. After the metadata information is defined, duplicate data are eliminated at both the file level and the data-block level, and the multiple storage nodes of the file-path information cluster are used to speed up elimination. The metadata information is also used to make the data recoverable, which ensures data reliability. Simulation results show that the method improves the deduplication effect and shortens the time the elimination process takes, making it both efficient and reliable.

Keywords: file path; information cluster; duplicate data; elimination; directory name; storage node

Classification number: TP301 [Automation and Computer Technology: Computer System Architecture]
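
The chunk-level idea summarized in the abstract (compare chunks, keep one instance, replace repeats with pointers, and keep metadata so files can be restored) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class name `DedupStore`, the fixed chunk size, the SHA-256 fingerprints, and the modulo rule for choosing a storage node are all assumptions made for illustration. The abstract does not say how directory names enter the comparison, so here the file path serves only as the key of the per-file metadata.

```python
import hashlib


class DedupStore:
    """Toy chunk-level deduplicating store (illustrative assumptions only)."""

    def __init__(self, num_nodes=2, chunk_size=4):
        self.chunk_size = chunk_size
        # Each "storage node" is a dict: fingerprint -> unique chunk bytes.
        self.nodes = [{} for _ in range(num_nodes)]
        # Metadata: file path -> ordered list of chunk fingerprints (pointers).
        self.metadata = {}

    def _node_for(self, fingerprint):
        # Assumed placement rule: spread chunks over nodes by fingerprint value.
        return self.nodes[int(fingerprint, 16) % len(self.nodes)]

    def put(self, path, data):
        """Store a file; return how many new unique chunks were written."""
        pointers = []
        written = 0
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            fingerprint = hashlib.sha256(chunk).hexdigest()
            node = self._node_for(fingerprint)
            if fingerprint not in node:
                # First time this content is seen: keep the single instance.
                node[fingerprint] = chunk
                written += 1
            # Duplicate or not, the file only records a pointer to the chunk.
            pointers.append(fingerprint)
        self.metadata[path] = pointers
        return written

    def get(self, path):
        """Rebuild a file from its metadata, following each pointer."""
        return b"".join(self._node_for(fp)[fp] for fp in self.metadata[path])

    def unique_chunks(self):
        """Total number of distinct chunks held across all nodes."""
        return sum(len(node) for node in self.nodes)
```

With `chunk_size=4`, storing `b"AAAABBBBAAAA"` under one path writes two unique chunks, because the third chunk repeats the first. Storing `b"BBBBCCCC"` under a second path then writes only one more, and `get` still returns each file byte for byte because the metadata keeps the full pointer sequence.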

 
