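Authors: SONG Peitao; ZHANG Zhijian; LIANG Liang; ZHANG Qian; ZHAO Qiang (Fundamental Science on Nuclear Safety and Simulation Technology Laboratory, Harbin Engineering University, Harbin 150001, China)
Affiliation: [1] Fundamental Science on Nuclear Safety and Simulation Technology Laboratory, Harbin Engineering University
Source: Atomic Energy Science and Technology (《原子能科学技术》), 2019, No. 11, pp. 2209-2217 (9 pages)
Funding: Operating Fund of the Key Laboratory of Nuclear Reactor System Design Technology; Heilongjiang Provincial Youth Science Foundation (QC2018003); Research on Key Technologies for Digital Reactor Engineering (J121217001)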
Abstract: CPU-GPU heterogeneous systems offer a way to accelerate fine-grained whole-core neutron transport calculations with the method of characteristics (MOC). Building on an implementation of a 2D MOC heterogeneous parallel algorithm for CPU-GPU systems, a performance analysis model was proposed and used to identify the main factors that limit the algorithm's parallel efficiency. To address these factors, the transport sweep was overlapped with data movement, which improved the overall parallel efficiency. The numerical results show that the code maintains good accuracy. Data movement, which comprises MPI communication and data copies between CPU and GPU, is the main factor affecting the parallel efficiency of the heterogeneous parallel algorithm. With the transport sweep and data movement overlapped, both the overall performance and the strong scaling efficiency improve: on 5 heterogeneous nodes (20 GPUs in total), overall performance improves by about 8% and strong scaling efficiency rises from 87% to 95%. Compared with CPU-only parallel computation, 4 CPU-GPU heterogeneous nodes outperform 20 CPU nodes overall.
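Keywords: heterogeneous parallelism; method of characteristics; neutron transport calculation; GPU; CUDA
Classification: TL329 [Nuclear Science and Technology: Nuclear Technology and Applications]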
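The abstract refers to two quantities, strong scaling efficiency and the effect of overlapping computation with data movement. The sketch below shows one common way to reason about them. It is not the performance analysis model from the paper, which the abstract does not spell out. The function names, the idealized "sum versus max" timing model, and all numbers in the usage lines are assumptions chosen only for illustration.

```python
def strong_scaling_efficiency(t_ref, n_ref, t_n, n):
    """Strong scaling efficiency of an n-node run relative to a reference run.

    t_ref: wall time of the reference run on n_ref nodes
    t_n:   wall time of the same fixed-size problem on n nodes
    A value of 1.0 means ideal scaling.
    """
    return (t_ref * n_ref) / (t_n * n)


def step_time(t_compute, t_transfer, overlapped):
    """Idealized time for one iteration (an assumed model, not the paper's).

    Without overlap, computation and data movement run back to back.
    With perfect overlap, the longer of the two hides the shorter.
    """
    if overlapped:
        return max(t_compute, t_transfer)
    return t_compute + t_transfer


# Hypothetical timings in arbitrary units, not taken from the paper.
serial = step_time(10.0, 2.0, overlapped=False)
hidden = step_time(10.0, 2.0, overlapped=True)
efficiency = strong_scaling_efficiency(t_ref=100.0, n_ref=1, t_n=25.0, n=5)
```

In this idealized model, overlap helps most when data movement takes a noticeable fraction of each iteration, which is consistent with the abstract's finding that data movement dominates the loss of parallel efficiency. Real overlap is rarely perfect, so measured gains are usually smaller than the model predicts.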