HW/SW Co-optimization for Stencil Computation: Beginning with a Customizable Core

Authors: Yanhua Li, Youhui Zhang, Weiming Zheng

Affiliation: [1] Department of Computer Science, Tsinghua University, Beijing 100084, China

Source: Tsinghua Science and Technology, 2016, No. 5, pp. 570-580 (11 pages); Journal of Tsinghua University (Natural Science Edition, English Edition)

Funding: Supported by the National High-Tech Research and Development (863) Program of China (No. 2013AA01A215) and the Brain Inspired Computing Research of Tsinghua University (No. 20141080934).

Abstract: Energy efficiency is one of the most important issues in High Performance Computing (HPC) today. A heterogeneous HPC platform with energy-efficient customizable cores (serving as application-specific accelerators) is believed to be one of the promising solutions for meeting ever-increasing computing needs and overcoming power density limitations. In this paper, we focus on using customizable processor cores to optimize typical stencil computations, the kernel of many high-performance applications. We develop a series of effective software/hardware co-optimization strategies to exploit instruction-level and memory-computation parallelism, as well as to decrease energy consumption. These optimizations include loop tiling, prefetching, cache customization, Single Instruction Multiple Data (SIMD), and Direct Memory Access (DMA), as well as the necessary ISA extensions. Detailed power-efficiency tests are given to evaluate the effect of all these optimizations comprehensively. The results are impressive: the combination of these optimizations improved application performance by 341% while decreasing energy consumption by 35%; a preliminary comparison with X86, GPU, and FPGA platforms also showed that the design could achieve an order of magnitude higher performance efficiency. We believe this work can help in understanding sources of inefficiency in general-purpose chips and can serve as a starting point for customizing an energy-efficient CMP for further improvement.
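
The abstract lists loop tiling among the software-side optimizations for stencil kernels. As a rough illustration only, the sketch below applies tiling to one sweep of a 2D 5-point averaging stencil. The abstract does not say which stencil, grid size, or tile shape the authors used, so the stencil form, the function names `jacobi_naive` and `jacobi_tiled`, and the `tile` parameter are all assumptions made for this example and are not taken from the paper.

```python
def jacobi_naive(grid):
    """One sweep of a 5-point averaging stencil (assumed form).

    Interior cells become the mean of their four neighbours;
    boundary cells are copied unchanged.
    """
    n, m = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for i in range(1, n - 1):
        for j in range(1, m - 1):
            out[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                + grid[i][j - 1] + grid[i][j + 1])
    return out


def jacobi_tiled(grid, tile=4):
    """Same sweep, but the interior is visited tile by tile.

    Each tile x tile block is finished before moving on, so the rows
    it reads can stay in a small cache or local memory. The tile size
    here is a hypothetical tuning parameter.
    """
    n, m = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for ii in range(1, n - 1, tile):          # tile origin, row
        for jj in range(1, m - 1, tile):      # tile origin, column
            for i in range(ii, min(ii + tile, n - 1)):
                for j in range(jj, min(jj + tile, m - 1)):
                    out[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                        + grid[i][j - 1] + grid[i][j + 1])
    return out
```

Tiling changes only the order in which cells are visited, so both functions return identical grids for any tile size. On real hardware the benefit comes from data reuse within a tile; this Python version shows the loop structure and is not meant to demonstrate a speedup.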

Keywords: energy efficiency; customizable processor; stencil computation; software and hardware co-optimization

Classification: TP38 [Automation and Computer Technology: Computer System Architecture]
