Empirical Study on Porting the Community Earth System Model (CESM) to ARM HPC Clusters


Authors: XU Haixiao[1]; WU Qi[1]; YU Hongmei[1]; XU Zhewen; LI Xiang[1]; ZHAO Yu[2]; LIU Zhiqi

Affiliations: [1] College of Computer Science and Technology, Jilin University, Changchun 130012, Jilin, China; [2] China-Japan Union Hospital of Jilin University, Changchun 130013, Jilin, China; [3] Research and Development Institute, China FAW Group Corporation, Changchun 130000, Jilin, China

Source: Experimental Technology and Management (《实验技术与管理》), 2023, No. 11, pp. 40-45, 70 (7 pages)

Abstract: Climate simulation is challenging because it involves a large number of interacting physical processes. The Community Earth System Model (CESM), a state-of-the-art open-source coupled climate system model, is widely used to predict regional and global climate. Running CESM requires a large amount of numerical computation, and ultrahigh-resolution climate simulations demand even larger-scale parallelism. In recent years, the emergence of ARM-based HPC (high-performance computing) clusters has provided a new option for hosting such computation-intensive systems. Scalability and power efficiency are two critical issues for traditional HPC platforms. Compared with traditional x86 platforms, ARM-based processors provide higher memory bandwidth and more cores per chip, which can potentially benefit application scalability. In this work, CESM is successfully ported to the Huawei Kunpeng platform, which is based on the ARM architecture. Based on CESM runtime data, a customized C/Fortran compiler is proposed and the process scheduling algorithm is improved. Extensive comparative experiments were conducted on the Huawei Kunpeng and Intel Xeon platforms. The results show that the optimized CESM instance on the Huawei Kunpeng platform achieves an overall performance improvement of 31.78% to 42.93% and better scalability, despite relatively lower single-core performance.

Keywords: climate model; ARM-based processor; porting; performance optimization

CLC number: TP399 [Automation and Computer Technology: Computer Application Technology]

 
