Ad Hoc File Systems for High-Performance Computing  (Cited by: 1)


Authors: André Brinkmann, Kathryn Mohror, Weikuan Yu, Philip Carns, Toni Cortes, Scott A. Klasky, Alberto Miranda, Franz-Josef Pfreundt, Robert B. Ross, Marc-André Vef

Affiliations: [1] Zentrum für Datenverarbeitung, Johannes Gutenberg University Mainz, Mainz 55128, Germany; [2] Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, Livermore, CA 94550, U.S.A.; [3] Department of Computer Science, Florida State University, Tallahassee, FL 32306, U.S.A.; [4] Mathematics and Computer Science Division, Argonne National Laboratory, Lemont, IL 60439, U.S.A.; [5] Department of Computer Architecture, Universitat Politecnica de Catalunya, Barcelona 08034, Spain; [6] Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, U.S.A.; [7] Computer Science Department, Barcelona Supercomputing Center, Barcelona 08034, Spain; [8] Fraunhofer Institute for Industrial Mathematics ITWM, Fraunhofer-Platz 1, Kaiserslautern 67663, Germany

Source: Journal of Computer Science & Technology (计算机科学技术学报, English edition), 2020, Issue 1, pp. 4-26 (23 pages)

Funding: This work has also been partially funded by the German Research Foundation (DFG) through the German Priority Programme 1648 "Software for Exascale Computing" (SPPEXA) and the ADA-FS project, and by the European Union's Horizon 2020 Research and Innovation Program under the NEXTGenIO Project under Grant No. 671591; the Spanish Ministry of Science and Innovation under Contract No. TIN2015-65316; and the Generalitat de Catalunya under Contract No. 2014-SGR-1051. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract No. DE-AC52-07NA27344 (LLNL-JRNL-779789), and was also supported by the U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research, under Contract No. DE-AC02-06CH11357. This work is also supported in part by the National Science Foundation of USA under Grant Nos. 1561041, 1564647, 1744336, 1763547, and 1822737.

Abstract: Storage backends of parallel compute clusters are still based mostly on magnetic disks, while newer and faster storage technologies such as flash-based SSDs or non-volatile random access memory (NVRAM) are deployed within compute nodes. Including these new storage technologies in scientific workflows is unfortunately still a mostly manual task, and most scientists therefore do not take advantage of the faster storage media. One approach to systematically include node-local SSDs or NVRAMs in scientific workflows is to deploy ad hoc file systems over a set of compute nodes, which serve as temporary storage systems for single applications or longer-running campaigns. This paper presents results from the Dagstuhl Seminar 17202 "Challenges and Opportunities of User-Level File Systems for HPC" and discusses application scenarios as well as design strategies for ad hoc file systems using node-local storage media. The discussion includes open research questions, such as how to couple ad hoc file systems with the batch scheduling environment and how to schedule stage-in and stage-out processes of data between the storage backend and the ad hoc file systems. Also presented are strategies to build ad hoc file systems by using reusable components for networking and how to improve storage device compatibility. Various interfaces and semantics are presented, for example those used by the three ad hoc file systems BeeOND, GekkoFS, and BurstFS. Their presentation covers a range from file systems running in production to cutting-edge research focusing on reaching the performance limits of the underlying devices.

Keywords: parallel architectures; distributed file system; high-performance computing; burst buffer; POSIX (portable operating system interface)

Classification number: TP39 [Automation and Computer Technology: Computer Application Technology]

 
