Research on distributed storage of 3D stack grid model of coal mine geology based on HDF5  (Cited by: 2)


Author: GUO Jun (郭军)[1,2]

Affiliations: [1] Research Institute of Mine Big Data, CCTEG Chinese Institute of Coal Science, Beijing 100013, China; [2] State Key Laboratory of High Efficient Mining and Clean Utilization of Coal Resources, Beijing 100013, China

Source: Journal of Mine Automation (《工矿自动化》), 2023, No. 1, pp. 153-161 (9 pages)

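Funding: Beijing Science and Technology Plan, Applied Technology Collaborative Innovation Project (Z201100004520015); China Coal Technology and Engineering Group, Science and Technology Innovation and Entrepreneurship Fund, Special Key Project (2022-TD-ZD003).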

Abstract: Using true 3D gridded geological models to achieve multi-resolution representation and multi-parameter fusion of the coal mine geological environment is a key topic in coal mine geoscience big data research. The core issues are the organization, storage, and management of 3D geological model data. To address the data scale, distributed storage, and query performance of 3D geological grid models in coal mines, a distributed storage scheme for the 3D stack grid model based on HDF5 is proposed. For grid data organization, the stack grid model is used to compress the 3D geological model data and organize it in blocks. Blocking solves the organization problem for large-scale geological grid model data and also places spatially adjacent data in adjacent disk sectors or storage devices, which improves data scheduling efficiency. For data storage, HDF5 serves as the persistence layer and stores all original data, while the in-memory database Redis stores hot data, HDF5 metadata, and related information. For Web services, H5Serv is used to send and receive HDF5 data. To make HDF5 distributed, the network file system (NFS) shares HDF5 data among node servers, Rsync and Inotify synchronize HDF5 data across node servers in real time, and Nginx provides reverse proxying and load balancing of the data service nodes. Docker container technology is used to deploy the data node services and the Nginx service uniformly, and the JupyterLab interactive analysis platform is used to schedule and manage data resources in real time. The experimental results show the following. Data organization based on the stack grid and distributed storage based on HDF5 achieve effective storage management and spatial query of 3D geological grid models of coal mines. Compared with the voxel model and the octree model, the stack grid model has a smaller data volume and makes fast spatial queries of geological interfaces easier; its spatial query performance is better than that of the relational database MySQL and the non-relational database MongoDB, and it is better suited to the gridded representation and data organization of coal-measure sedimentary strata. HDF5 file storage saves noticeably more space than MySQL and MongoDB database storage, mainly because an HDF5 DataSet can store data blocks directly without additional stored information. The data organization and storage scheme based on the stack grid model and HDF5 can, for coal…
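The abstract's claim that a stack grid holds less data than a voxel model and makes geological interfaces quick to query can be illustrated with a minimal sketch. It rests on an assumption not confirmed by the abstract: that a stack grid stores, for each horizontal cell, an ordered list of layer intervals (bottom elevation, top elevation, stratum label) rather than one value per voxel. The function names `voxels_to_stack`, `label_at`, and `interfaces`, and the sample column, are hypothetical and not taken from the paper.

```python
from bisect import bisect_right
from itertools import groupby


def voxels_to_stack(column, z0=0.0, dz=1.0):
    """Collapse a vertical voxel column (listed bottom to top) into layers.

    Each layer is (z_bottom, z_top, label). Assumed layout, not the paper's.
    """
    layers = []
    k = 0  # index of the first voxel of the current run
    for label, run in groupby(column):
        n = len(list(run))
        layers.append((z0 + k * dz, z0 + (k + n) * dz, label))
        k += n
    return layers


def label_at(layers, z):
    """Return the stratum label at elevation z, or None outside the column."""
    if not layers or z < layers[0][0] or z >= layers[-1][1]:
        return None
    bottoms = [bottom for bottom, _, _ in layers]
    # Binary search for the last layer whose bottom is <= z.
    return layers[bisect_right(bottoms, z) - 1][2]


def interfaces(layers):
    """Elevations of the boundaries between vertically adjacent layers."""
    return [top for _, top, _ in layers[:-1]]


# Hypothetical column: 9 voxels become 3 stacked layers.
column = ["sandstone"] * 4 + ["coal"] * 2 + ["mudstone"] * 3
stack = voxels_to_stack(column, z0=-10.0, dz=1.0)
```

In this sketch the interface elevations are read directly from the layer list without scanning voxels, which is one plausible reason interface queries are cheap on such a structure; the paper's actual encoding and block layout in HDF5 may differ.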

Keywords: coal mine geological model; 3D stack grid; distributed storage; grid data organization; spatial query; HDF5

CLC number: TD67 [Mining engineering: mine electromechanical engineering]

 
