Efficient parallelization of SPH algorithm on modern multi-core CPUs and massively parallel GPUs  

Authors: Pravin Jagtap, Rupesh Nasre, V. S. Sanapala, B. S. V. Patnaik

Affiliations: [1] Department of Applied Mechanics, Indian Institute of Technology Madras, Chennai 600036, India; [2] Department of Computer Science & Engineering, Indian Institute of Technology Madras, Chennai 600036, India; [3] Indira Gandhi Centre for Atomic Research, Homi Bhabha National Institute, Kalpakkam 603102, India

Source: International Journal of Modeling, Simulation, and Scientific Computing, 2021, Issue 6, pp. 102-128 (27 pages)

Abstract: Smoothed Particle Hydrodynamics (SPH) is fast emerging as a practically useful computational simulation tool for a wide variety of engineering problems. SPH is also gaining popularity as the backbone for fast and realistic animations in graphics and video games. The Lagrangian and mesh-free nature of the method facilitates fast and accurate simulation of material deformation, interface capture, etc. Typically, particle-based methods require particle search-and-locate algorithms to be implemented efficiently, as the continuous creation of neighbor particle lists is a computationally expensive step. Hence, it is advantageous to implement SPH on modern multi-core platforms with the help of High-Performance Computing (HPC) tools. In this work, the computational performance of an SPH algorithm is assessed on a multi-core Central Processing Unit (CPU) as well as on massively parallel General Purpose Graphical Processing Units (GP-GPU). Parallelizing SPH poses several challenges, such as scalability of the neighbor search process, force calculations, minimizing thread divergence, achieving coalesced memory access patterns, balancing workload, and ensuring optimum use of computational resources. While addressing some of these challenges, performance metrics such as speedup, global load efficiency, global store efficiency, warp execution efficiency, and occupancy are analyzed in detail. The OpenMP and Compute Unified Device Architecture (CUDA) parallel programming models have been used for parallel computing on an Intel Xeon(R) E5-2630 multi-core CPU and on NVIDIA Quadro M4000 and NVIDIA Tesla P100 massively parallel GPU architectures. Standard benchmark problems from the Computational Fluid Dynamics (CFD) literature are chosen for validation. The key concern of how to identify a suitable architecture for mesh-less methods, which essentially require a heavy workload of neighbor search and evaluation of local force fields from neighbor interactions, is addressed.

Keywords: SPH; GPU; CUDA; OpenMP; HPC; CFD; particle-based methods

Classification No.: TP3 [Automation and Computer Technology: Computer Science and Technology]

 
