Theoretical Feasibility Proof of the Multi-Log Analysis-Based Big Data Provenance Generation Method


Authors: GAO Yuanzhao, CHEN Xingyuan [1,2], DU Xuehui, LI Binglong [1]

Affiliations: [1] Third School, Information Engineering University, Zhengzhou 450001, Henan, China; [2] State Key Laboratory of Cryptology, Beijing 100878, China

Source: Journal of Wuhan University (Natural Science Edition), 2023, No. 6, pp. 729-738 (10 pages)

Funding: National Key Research and Development Program of China (2018YFB0803603); Science and Technology Innovation Special Zone Project (18-H863-01-ZT-005-017-01).

Abstract: Data provenance is an effective approach to security supervision of big data. In big data systems, only the provenance generation method based on multi-log analysis can present a complete provenance view of the data's whole life cycle. However, because provenance types are diverse, it remains to be proved whether this method can obtain all the data required for supervision, that is, whether the method is theoretically feasible. To this end, this paper proposes a formal proof method for the first time. First, provenance completeness is formally defined. Then, by reducing and associating the record types contained in each log, it is proved whether the specified provenance information can be fully obtained from existing logs. For provenance types whose attribute elements are scattered across multiple logs, two log association methods are proposed, one based on common log elements and one based on program operation principles. Finally, the proposed method is used to prove that multi-log analysis can be used for Hadoop provenance generation.
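As a rough illustration of log association through common elements, the Python sketch below merges records from two logs that share a key and then checks which required provenance attributes are still missing. It is not the paper's formal proof: the log names, the field names, and the `REQUIRED` attribute set are hypothetical and were chosen only to show why attributes scattered across logs must be associated before completeness can be judged.

```python
from collections import defaultdict

# Hypothetical provenance type: attributes assumed necessary for supervision.
REQUIRED = {"user", "operation", "file", "job_id", "timestamp"}

# Hypothetical records from two logs; all field names are illustrative only.
audit_log = [
    {"job_id": "job_01", "user": "alice", "operation": "read", "timestamp": 100},
    {"job_id": "job_02", "user": "bob", "operation": "write", "timestamp": 140},
]
job_log = [
    {"job_id": "job_01", "file": "/data/in.csv"},
    {"job_id": "job_03", "file": "/data/tmp.csv"},
]


def associate(logs, key):
    """Merge records from several logs that share the same value of `key`."""
    merged = defaultdict(dict)
    for log in logs:
        for record in log:
            if key in record:
                merged[record[key]].update(record)
    return dict(merged)


def missing_attributes(merged, required):
    """For each associated record, list the required attributes it lacks."""
    return {k: sorted(required - set(rec)) for k, rec in merged.items()}


merged = associate([audit_log, job_log], key="job_id")
gaps = missing_attributes(merged, REQUIRED)
# Keys whose associated record covers every required attribute.
complete = [k for k, missing in gaps.items() if not missing]
```

In this toy data only the record that appears in both logs covers every required attribute. Neither log is sufficient alone, which is the situation the association step in the abstract addresses.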

Keywords: big data; provenance data generation; multi-log analysis; provenance completeness proof; Hadoop; Progger

CLC number: TP309.2 [Automation and Computer Technology: Computer System Architecture]

 
