在Intel Knights Corner和NVIDIA Kepler架构上OpenACC的性能可移植性分析  被引量:1

Performance Portability Evaluation for OpenACC on Intel Knights Corner and NVIDIA Kepler

在线阅读下载全文

作  者:王一超[1] 秦强[1] 施忠伟[1] 林新华[1] 

机构地区:[1]上海交通大学,上海200240

出  处:《计算机科学》2015年第1期75-78,共4页Computer Science

摘  要:OpenACC是一套基于指导语句方式的并行编程语言标准。编程者可以通过在代码中添加符合该标准的指导语句,经OpenACC编译器的编译,将串行代码并行化地移植到加速器或者协处理器上,进而获得异构加速器所带来的加速效果。OpenACC与CUDA和OpenCL这类异构并行编程技术的不同之处在于,它的目的是使编程者在应用移植过程中不需要考虑加速器或协处理器的底层硬件架构,从而降低编程难度。同时它也具有仅需维护一套代码便可在不同硬件平台上运行的优良跨平台性。因此,OpenACC是一个值得研究的并行编程标准。如今的异构加速硬件设备呈现出多元化趋势。在2013年11月的Top500榜单上排名第一的"天河二号"使用了48000块构建在Intel Knights Corner架构之上的协处理器。与此同时,发布不久的NVIDIA公司最新的Kepler架构GPU产品由于多年来的GPU市场积累也迅速形成了可观的用户群体。对于并非追求性能极限的应用移植者而言,寻求应用性能和移植简易性之间的平衡是相当重要的议题。只需要编写一套代码便可运行在这两种硬件平台上的OpenACC正迎合了用户在移植简易性上的需求。解决了移植的简易性之后,同一个应用在不同硬件平台上的性能表现便成了用户最想了解的问题。通过实验和构建性能模型向读者展示使用OpenACC移植的应用在Intel Knights Corner和NVIDIA Kepler架构硬件上的性能可移植性。OpenACC is a programming standard designed to simplify heterogeneous parallel programming by using directives.Since OpenACC can generate OpenCL and CUDA code,meanwhile running OpenCL on Intel Knight Corner is supported by CAPS HMPP compiler,it is attractive to using OpenACC on hardwares with different underlying microarchitectures.This paper studied how realistic it is to use a single OpenACC source code for a set of hardwares with different underlying micro-architectures.Intel Knight Corner and Nvidia Kepler products are the targets in the experiment,since they have the latest architectures and similar peak performance.Meanwhile CAPS OpenACC compiler is used to compile EPCC OpenACC benchmark suite,Stream and MaxFlops of SHOC benchmarks to access the performance.To study the performance portability,roofline model and relative performance model were built by the data of ex periments.It shows that at most 82% performance compared with peak performance on Kepler and Knight Corner is achieved by specific benchmarks,but as the rise of arithmetic intensity,the average performance is approximately 10%.And there is a big performance gap between Intel Knight Corner and Nvidia Kepler on several benchmarks.This study confirmed that performance portability of OpenACC is related to the arithmetic intensity and a big performance gap still exsits in specific benchmarks between different hardware platforms.

关 键 词:OpenACC 性能可移植性 高性能计算 

分 类 号:TP338.6[自动化与计算机技术—计算机系统结构]

 

参考文献:

正在载入数据...

 

二级参考文献:

正在载入数据...

 

耦合文献:

正在载入数据...

 

引证文献:

正在载入数据...

 

二级引证文献:

正在载入数据...

 

同被引文献:

正在载入数据...

 

相关期刊文献:

正在载入数据...

相关的主题
相关的作者对象
相关的机构对象