Efficient GPU parallel computing of population balance-Monte Carlo method for coagulation (Cited by: 3)


Authors: Zhao Haibo[1], Xu Zuwei[1], Liu Xin[1], Shi Jiawei[1], Zheng Chuguang[1]

Affiliation: [1] State Key Laboratory of Coal Combustion, Huazhong University of Science and Technology, Wuhan 430074

Source: Chinese Science Bulletin (《科学通报》), 2014, No. 14, pp. 1358-1368 (11 pages)

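Funding: Supported by the National Natural Science Foundation of China (51276077; 51021065), the Program for New Century Excellent Talents in University of the Ministry of Education (NCET-09-0395), and the Open Project of the State Key Laboratory of Multiphase Complex Systems (MPCS-2011-D-02)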

Abstract: The Monte Carlo (MC) method is an effective approach to solving the population balance equation (PBE). The resulting population balance-Monte Carlo (PBMC) method has attracted lasting and widespread attention because it adapts well to multidimensional problems, its discrete and stochastic nature matches real particle dynamics, and its program structure is relatively simple and easy to implement. For two-particle events such as coagulation, however, conventional PBMC methods require a double loop over all simulation particles, so the computational cost scales with the square of the number of simulation particles, which limits engineering applications. The rapid development of parallel computing, in particular the compute unified device architecture (CUDA) introduced by NVIDIA in recent years, provides a good platform for fast and efficient PBMC simulation. CUDA is a programming approach for performing scientific calculations on a graphics processing unit (GPU) as a data-parallel computing device. This article presents a CUDA implementation of PBMC for particle coagulation dynamics on the GPU, with coagulation pairs selected by either the inverse (cumulative probability) scheme or the acceptance-rejection (AR) scheme, working in cooperation with the central processing unit (CPU). The main idea is to run the highly threaded data-parallel tasks on the GPU and leave the serial computation of complex logic and transaction processing to the CPU. The accuracy of the GPU-based PBMC was validated against a benchmark, a CPU-based discrete-sectional method. To evaluate the acceleration, the computing time on the GPU was compared with that of its sequential counterpart on the CPU. Compared with the serial computation widely run on CPUs, the GPU implementation gives accurate results at a computational cost proportional only to the number of particles. On current mainstream GPU/CPU hardware it accelerates the PBMC by a factor ranging from tens to more than one hundred, depending on the number of simulation particles.
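The two pair-selection schemes named in the abstract can be illustrated with a minimal serial sketch. This is not the authors' CUDA implementation: it is a plain CPU version, and the function names, the sum kernel `sum_kernel`, and the bound `k_max` are assumptions introduced here for illustration only. The inverse scheme walks the cumulative sum of all pair kernels (the double loop behind the quadratic cost), while the acceptance-rejection scheme draws a random pair and accepts it with probability K_ij / K_max.

```python
import random


def sum_kernel(vi, vj):
    """Hypothetical coagulation kernel K(vi, vj) = vi + vj (illustration only)."""
    return vi + vj


def select_pair_inverse(volumes, rng, kernel=sum_kernel):
    """Inverse (cumulative probability) scheme: double loop over all pairs."""
    n = len(volumes)
    total = sum(kernel(volumes[i], volumes[j])
                for i in range(n) for j in range(i + 1, n))
    target = rng.random() * total
    running = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            running += kernel(volumes[i], volumes[j])
            if running >= target:
                return i, j
    return n - 2, n - 1  # guard against floating-point round-off


def select_pair_ar(volumes, rng, k_max, kernel=sum_kernel):
    """Acceptance-rejection scheme: accept a random pair with prob. K_ij / k_max.

    k_max must be an upper bound on the kernel over all pairs.
    """
    n = len(volumes)
    while True:
        i, j = rng.sample(range(n), 2)  # two distinct particle indices
        if rng.random() * k_max <= kernel(volumes[i], volumes[j]):
            return min(i, j), max(i, j)


def coagulate(volumes, i, j):
    """Merge particles i and j into one particle, conserving total volume."""
    merged = volumes[i] + volumes[j]
    rest = [v for k, v in enumerate(volumes) if k not in (i, j)]
    return rest + [merged]


if __name__ == "__main__":
    rng = random.Random(0)
    vols = [1.0, 2.0, 3.0, 4.0]
    i, j = select_pair_ar(vols, rng, k_max=2 * max(vols))
    print(coagulate(vols, i, j))
```

In this sketch the acceptance-rejection scheme avoids the full pair loop per event, at the price of needing a valid upper bound `k_max`. How the paper distributes either scheme across GPU threads is not described in the abstract and is not reproduced here.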

Keywords: population balance modeling; coagulation; stochastic simulation; parallel computing; CUDA; computational efficiency

Classification code: O35 [Science: Fluid Mechanics]

 
