Affiliation: [1] School of Computer Science, Beihang University, Beijing 100191
Source: Chinese Journal of Computers (《计算机学报》), 2018, No. 10, pp. 2221-2235 (15 pages)
Funding: Supported by the National Natural Science Foundation of China (61133004; 61502019) and the Key Research and Development Program (2016YFB0200100).
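Abstract: The sea ice model is an important component of the earth system model; it uses a finite-difference grid and time stepping to simulate how sea ice regions change over time. Sea ice models are compute-intensive, and as the required simulation resolution increases, traditional hardware can no longer meet their computational demands. The Sunway TaihuLight supercomputer is the first supercomputer with a peak performance above 100 Pflops. It offers a new hardware platform for high-resolution sea ice simulation, but efficiently parallelizing algorithms on this platform still faces many problems. Some applications have already been ported to and parallelized on many-core platforms, but compared with other fields, the porting and parallelization of climate software on many-core platforms has been relatively slow. Most research on parallelizing climate models on many-core platforms is GPU-based. Early studies mostly targeted individual climate computation processes, which are usually compute-intensive programs with relatively little communication, so GPU-based implementations can achieve good parallel performance. Unlike a single computation process, the sea ice model must interact with several other climate model components, so the main problems we faced were how to reduce communication overhead and how to make full use of the parallel performance of the Sunway processor. To address them, this paper designs a method for porting and parallelizing sea ice model algorithms on the Sunway many-core processor. Each Sunway many-core processor contains 4 core groups, and each core group contains one management core and 64 computing cores. To fully exploit the parallelism of the Sunway many-core processor, the method improves and optimizes the sea ice model's data partitioning, data transfer, and computation scheme. Using this method, the paper ports and parallelizes two algorithms of the sea ice model and evaluates performance with the CICE test dataset and the COREv2 dataset. Experiments show that the two optimized parallel algorithms run 11.6 times and 9.8 times faster, respectively, than when they run on the management core alone, and that the method achieves up to a 40% performance improvement over a basic parallelization approach.
English abstract: The sea ice model is an important part of the earth system model; it uses a finite difference grid and time stepping to simulate changes in sea ice. Sea ice model simulations can take from hours to days to complete because the model is compute-intensive. As a result, the size and resolution of simulations are constrained by the performance limits of modern computing hardware. The Sunway TaihuLight supercomputer is the world’s first system with a peak performance greater than 100 Pflops. It brings a new opportunity for high-resolution earth system model simulations, but exploiting the parallelism in this architecture is not trivial. Some applications have been ported to and parallelized on many-core platforms, but compared with other HPC application domains, the porting of climate models onto many-core architectures has been relatively slow. Most efforts are GPU-based, and most early-stage efforts focused on the physics modules. Since these modules are usually compute-intensive and do not involve communications, GPU-based acceleration can generally achieve a speedup of one order of magnitude. Unlike the projects mentioned above, our work focuses on CICE 5.1 and uses the Sunway many-core processor as its hardware platform. Since the CICE sea ice model interacts with many other climate components, optimizing communications between Sunway many-core processors becomes an important issue we need to address. In addition, tuning the original CICE algorithms to fit the new computing elements is another problem we need to resolve. To address these issues, in this work we implement a new parallelization of the sea ice model algorithms on the Sunway many-core processor. The architecture of the Sunway many-core processor differs substantially from that of the CPUs in common use today. Each Sunway many-core processor contains 4 CGs (Core Groups), and each CG contains 1 MPE (Management Processing Element) and 64 CPEs (Computing Processing Elements). To exploit the massive parallelism offered by Sunway many-core processor, we propose a parallel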
Keywords: Sunway many-core processor; sea ice model; data transfer; data partitioning; computation scheme
Classification number: TP301 [Automation and Computer Technology: Computer System Architecture]