Scalability of 3D deterministic particle transport on the Intel MIC architecture  Cited by: 2

Authors: 王庆林 (Wang Qinglin), 刘杰 (Liu Jie), 龚春叶 (Gong Chunye), 邢座程 (Xing Zuocheng)

Affiliations: [1] Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology; [2] Science and Technology on Space Physics Laboratory

Source: Nuclear Science and Techniques (核技术, English edition), 2015, No. 5, pp. 88-97 (10 pages)

Funding: Supported by the National Natural Science Foundation of China (Nos. 61402039, 61170083, 60970033, 61373032 and 91430218); the National High Technology Research and Development Program of China (No. 2012AA01A301); the China Postdoctoral Science Foundation (No. 2014M562570); and the National Key Basic Research Program of China (No. 61312701001)

Abstract: The key to large-scale parallel solutions of deterministic particle transport problems is single-node computation performance. Hence, single-node computation is often parallelized on multi-core or many-core computer architectures. However, the number of on-chip cores grows quickly as the feature size in semiconductor technology scales down. In this paper, we present a scalability investigation of one-energy-group, time-independent, deterministic discrete ordinates neutron transport in 3D Cartesian geometry (Sweep3D) on Intel's Many Integrated Core (MIC) architecture, which currently provides up to 62 cores with four hardware threads per core and is expected to provide up to 72 cores in the future. The OpenMP parallel programming model and vector intrinsic functions are used to exploit thread parallelism and vector parallelism, respectively, for the discrete ordinates method. The results on a 57-core MIC coprocessor show that the implementation of Sweep3D on MIC scales well in performance. In addition, an application of the Roofline model to assess the implementation and a performance comparison between MIC and the Tesla K20C Graphics Processing Unit (GPU) are also reported.
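For readers unfamiliar with the Roofline model named in the abstract: in its basic form it bounds attainable performance by the smaller of the machine's peak compute rate and the product of memory bandwidth and the kernel's arithmetic intensity. The following is a minimal sketch of that bound only. The function name and all numeric values are hypothetical, chosen for illustration, and are not taken from the paper or from any specific MIC or GPU device.

```python
def roofline_gflops(peak_gflops, bandwidth_gbs, intensity_flops_per_byte):
    """Attainable GFLOP/s under the basic Roofline bound.

    peak_gflops:              peak compute rate of the device (GFLOP/s)
    bandwidth_gbs:            sustained memory bandwidth (GB/s)
    intensity_flops_per_byte: arithmetic intensity of the kernel (FLOP/byte)
    """
    return min(peak_gflops, bandwidth_gbs * intensity_flops_per_byte)


# Hypothetical machine parameters, for illustration only.
PEAK_GFLOPS = 1000.0
BANDWIDTH_GBS = 150.0

# Low-intensity kernels are limited by bandwidth, high-intensity ones by peak.
for intensity in (0.5, 2.0, 10.0):
    bound = roofline_gflops(PEAK_GFLOPS, BANDWIDTH_GBS, intensity)
    print(f"intensity={intensity} FLOP/byte -> bound={bound} GFLOP/s")
```

With these assumed numbers, a kernel at 0.5 FLOP/byte is memory-bound, while one at 10 FLOP/byte reaches the compute ceiling.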

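Keywords: computer architecture; scalability; particle transport; 3D geometry; Intel; MIC (Many Integrated Core); discrete ordinates method; computational performance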

Classification: O571.5 [Science: particle physics and nuclear physics]

 
