A common data model based on data analysis access

Cited by: 1


Authors: Teng Aiguo (滕爱国), Tan Jing (谭晶), Zha Yiyi (查易艺), Chen Feiyuan (陈飞园), Wu Zhenhuan (吴震寰)

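Affiliations: [1] State Grid Jiangsu Electric Power Company, Information and Telecommunication Branch, Nanjing, Jiangsu 210000; [2] State Grid Jiangsu Electric Power Company, Nanjing, Jiangsu 210000; [3] Beijing Youyou Tianyu System Technology Co., Ltd. (北京友友天宇系统技术有限公司), Beijing 100022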

Source: Wireless Internet Technology (《无线互联科技》), 2018, No. 2, pp. 104-107 (4 pages)

Abstract: For nearly 20 years, the neutron and synchrotron science community has wanted a common data format for exchanging experimental results and for the applications that reduce and analyze data. HDF5 has become the standard data container at many facilities; the main open problem is standardizing how data is organized (the schema) inside HDF5. This article proposes a solution that introduces a new layer of indirection for data access, the Common Data Model Access (CDMA) framework, which separates responsibilities between data reduction developers and institutes: developers are responsible for the data reduction code, and institutes provide the plug-ins that access the data. CDMA is a core API that accesses data through a data format plug-in mechanism and scientific application definitions (sets of keywords) agreed on by scientists and institutes. A new mapping system between application definitions and the physical data organization lets data reduction applications be developed independently of the data file container and schema. Each institute develops a data access plug-in for its own data file format, together with the mapping between application definitions and its data files. As a result, data reduction applications can be developed from a strictly scientific perspective and can immediately process data from multiple institutes.
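The indirection described in the abstract can be illustrated with a minimal Python sketch. It is not the CDMA API: every name here (`DataPlugin`, `NestedDictPlugin`, `FlatTablePlugin`, `Dataset`, `mean_counts`, the keyword `detector_counts`, and the physical paths) is hypothetical, and in-memory dictionaries stand in for real data files. It only shows the idea that reduction code asks for keywords while each institute supplies a plug-in and a keyword-to-location mapping.

```python
from typing import Any, Dict, Protocol


class DataPlugin(Protocol):
    """Hypothetical plug-in interface that each institute would implement."""

    def read(self, physical_path: str) -> Any:
        ...


class NestedDictPlugin:
    """Toy plug-in for institute A: hierarchical data addressed by '/'-separated paths."""

    def __init__(self, tree: Dict[str, Any]) -> None:
        self._tree = tree

    def read(self, physical_path: str) -> Any:
        node: Any = self._tree
        for part in physical_path.strip("/").split("/"):
            node = node[part]
        return node


class FlatTablePlugin:
    """Toy plug-in for institute B: a flat key/value table."""

    def __init__(self, table: Dict[str, Any]) -> None:
        self._table = table

    def read(self, physical_path: str) -> Any:
        return self._table[physical_path]


class Dataset:
    """Resolves application-definition keywords through an institute's mapping."""

    def __init__(self, plugin: DataPlugin, mapping: Dict[str, str]) -> None:
        self._plugin = plugin
        self._mapping = mapping  # keyword -> physical location

    def get(self, keyword: str) -> Any:
        return self._plugin.read(self._mapping[keyword])


def mean_counts(dataset: Dataset) -> float:
    """Data reduction code: it uses only keywords, never physical paths."""
    counts = dataset.get("detector_counts")
    return sum(counts) / len(counts)


# Each institute provides its own plug-in and its own mapping (made-up data).
site_a = Dataset(
    NestedDictPlugin({"entry": {"instrument": {"detector": {"data": [10, 20, 30]}}}}),
    {"detector_counts": "/entry/instrument/detector/data"},
)
site_b = Dataset(
    FlatTablePlugin({"DET_COUNTS": [4, 6]}),
    {"detector_counts": "DET_COUNTS"},
)

print(mean_counts(site_a))  # 20.0
print(mean_counts(site_b))  # 5.0
```

The same `mean_counts` function runs unchanged against both layouts, which is the separation of responsibilities the abstract describes: changing a file schema only requires a new mapping or plug-in, not new reduction code.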

Keywords: Common Data Model Access (CDMA); data analysis; data visualization; data reduction; dictionary mechanism

Classification: TP311.1 [Automation and Computer Technology: Computer Software and Theory]

 
