Authors: CAO De-sheng (曹德胜); CHENG Gang (程刚); XU Bang-shu (徐帮树)
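Affiliations: [1] School of Computer Science, North China Institute of Science and Technology, Langfang, Hebei 101601, China; [2] School of Qilu Transportation, Shandong University, Jinan, Shandong 250061, China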
Source: Computer Simulation (《计算机仿真》), 2024, No. 10, pp. 328-332 (5 pages)
Funding: National Natural Science Foundation of China (Grant No. 42377200).
Abstract: Because lossless backup records the data changed at every update, it generates a very large volume of backup data, which lengthens the backup process and makes data storage inefficient. To optimize data storage in big-data settings, a distributed incremental backup method for lossless databases that takes bandwidth limits into account is proposed. A similarity calculation is introduced to extract, for each defective record in the database, neighboring data with similar attributes; combined with the adaptive multi-level decision tree optimization (Group Method of Data Handling, GMDH) algorithm, this builds a computation structure of optimal complexity that is used to impute the defective data. The imputed data are then compressed with the lossless Lempel-Ziv-Welch (LZW) compression algorithm. Feature vectors of different dimensions describe the category of each data item, and the category is determined by combining Bootstrap resampling with probability theory. Data of different categories are backed up to different branches of an incremental backup tree. When data are updated, the branch nodes of the incremental backup tree are queried so that only non-duplicate data are backed up incrementally. Experiments show that the proposed method achieves efficient incremental backup with low bandwidth usage, which is significant for protecting application data.
Classification: TP309 [Automation and Computer Technology: Computer System Architecture]
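
The abstract names LZW as the lossless compression step but gives no implementation details. The following is a minimal textbook sketch of LZW in Python, not the authors' code: the function names `lzw_compress` and `lzw_decompress` are hypothetical, the dictionary is allowed to grow without bound, and codes are returned as plain integers rather than being bit-packed, all of which are simplifying assumptions.

```python
def lzw_compress(data: bytes) -> list[int]:
    """Encode bytes as a list of LZW codes (illustrative, unbounded dictionary)."""
    # Start with one dictionary entry per possible byte value.
    table = {bytes([i]): i for i in range(256)}
    next_code = 256
    current = b""
    codes: list[int] = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            # Keep extending the longest sequence already in the dictionary.
            current = candidate
        else:
            # Emit the code for the known prefix and learn the new sequence.
            codes.append(table[current])
            table[candidate] = next_code
            next_code += 1
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: list[int]) -> bytes:
    """Rebuild the original bytes from the codes produced by lzw_compress."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    out = bytearray(previous)
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # Special case: the code refers to the entry being defined right now.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out += entry
        # Mirror the entry the compressor added at this step.
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return bytes(out)


if __name__ == "__main__":
    sample = b"TOBEORNOTTOBEORTOBEORNOT"
    encoded = lzw_compress(sample)
    assert lzw_decompress(encoded) == sample
    print(len(sample), "bytes ->", len(encoded), "codes")
```

Because the decoder rebuilds the same dictionary from the code stream, no dictionary needs to be transmitted, which is one reason LZW suits a bandwidth-limited backup channel. A production version would cap the dictionary size and pack codes into variable-width bit fields.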