Multilevel Structure Parallel IO Algorithm Based on HDF5

Authors: MA Wenpeng; ZHAI Huanxin; LI Ruiying; YUAN Wu [3,4]

Affiliations: [1] College of Computer and Information Technology, Xinyang Normal University, Xinyang 464000, Henan, China; [2] College of Information and Blockchain Technology, Xinyang Vocational College of Art, Xinyang 464000, Henan, China; [3] Computer Network Information Center, Chinese Academy of Sciences, Beijing 100083, China; [4] University of Chinese Academy of Sciences, Beijing 100049, China

Source: Journal of Xinyang Normal University (Natural Science Edition), 2024, No. 4, pp. 433-441 (9 pages)

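Funding: National Key Research and Development Program of China (2020YFB1709500); Henan Province Key R&D and Promotion Special Project (Science and Technology Research) (222102210162).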

Abstract: For applications that read and write data at large scale, a multilevel parallel IO (Input/Output) scheme based on the hierarchical storage format HDF5 (Hierarchical Data Format 5) is proposed. The scheme has two layers, inter-node and intra-node. At the inter-node layer, data is read and written with the node as the unit; at the intra-node layer, the processes inside a node may work either cooperatively or independently. Depending on which intra-node working mode is used, two algorithms are proposed: a multilevel parallel IO algorithm and a multilevel sentinel parallel IO algorithm. Both aim to improve IO efficiency and to avoid redundant output files. Two typical application scenarios, heterogeneous computing and pure CPU computing, were evaluated on the Sugon (Shuguang) platform and an Intel platform respectively, in multiple groups of experiments with up to 4096 cores and up to 256G of data. The results show that the multilevel parallel IO algorithm improved IO efficiency by 1.97 to 25.87 times and the multilevel sentinel parallel IO algorithm improved IO efficiency by 6.53 to 9.36 times, while the number of output files was reduced to 1/4 and 1/32 of that produced by the multi-zone parallel IO algorithm.
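The node-as-unit idea in the abstract can be illustrated with a small, library-free sketch. It does not use HDF5 or MPI and is not the authors' algorithm; it only models how ranks might be grouped by node and how the output file count changes. The function names, the block placement of ranks onto nodes, the choice of the lowest rank as the per-node writer, and the premise that the baseline writes one file per process while the node-level scheme writes one file per node are all assumptions made for illustration.

```python
from collections import defaultdict


def group_ranks_by_node(num_ranks, ranks_per_node):
    """Map each node index to the ranks it hosts (block placement is assumed)."""
    nodes = defaultdict(list)
    for rank in range(num_ranks):
        nodes[rank // ranks_per_node].append(rank)
    return dict(nodes)


def plan_node_level_io(num_ranks, ranks_per_node):
    """For each node, pick one rank to perform IO on behalf of the node.

    Choosing the lowest rank as the writer is a hypothetical policy,
    not something stated in the abstract.
    """
    plan = {}
    for node, ranks in group_ranks_by_node(num_ranks, ranks_per_node).items():
        plan[node] = {"writer": min(ranks), "members": ranks}
    return plan


def file_count_ratio(num_ranks, ranks_per_node):
    """Ratio of output files: one file per node versus one file per process.

    Both file-count conventions are assumptions for this sketch.
    """
    per_process_files = num_ranks
    per_node_files = len(plan_node_level_io(num_ranks, ranks_per_node))
    return per_node_files / per_process_files


if __name__ == "__main__":
    for ranks_per_node in (4, 32):
        print(ranks_per_node, file_count_ratio(4096, ranks_per_node))
```

Under these assumptions, 4 and 32 ranks per node give ratios of 1/4 and 1/32. The abstract reports the same fractions but does not say how they arise, so this correspondence should be read as a plausible interpretation rather than a confirmed explanation.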

Keywords: hierarchical data format; large-scale parallel computing; parallel IO; data storage

Classification number: TP301.5 [Automation and Computer Technology: Computer System Architecture]

 
