Performance Analysis of GPU Programs Towards Better Memory Hierarchy Design  (Cited by: 2)


Authors: TANG Tao[1], PENG Lin[1], HUANG Chun[1], YANG Can-qun[1]

Affiliation: [1] College of Computer Science, National University of Defense Technology, Changsha 410073, China

Source: Computer Science (《计算机科学》), 2017, No. 12, pp. 1-10 (10 pages)

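Funding: Supported by the National Natural Science Foundation of China (61402488) and the Research Fund for the Doctoral Program of Higher Education of China, Ministry of Education (20134307120035)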

Abstract: With higher peak performance and energy efficiency than traditional CPUs, together with an increasingly mature software environment, GPUs have become one of the most popular accelerators for building heterogeneous parallel computing systems. Although a GPU hides memory access latency by switching flexibly among lightweight threads, its massive parallelism still places heavy pressure on the memory system, and its achieved performance depends strongly on memory access efficiency. The analysis and optimization of the memory access behavior of GPU programs has therefore long been an active research topic in GPU-related studies. However, few existing works analyze the impact of memory hierarchy design on performance from an architectural perspective. To better guide GPU memory hierarchy design and memory access optimization, this paper experimentally analyzes in detail how each level of the GPU memory hierarchy affects program performance, and summarizes several guiding optimization strategies that offer suggestions for both the memory hierarchy design of future GPU-like architectures and program optimization.

Keywords: heterogeneous system; graphics processing unit (GPU); memory hierarchy; performance analysis; optimization

Classification code: TP302.7 [Automation and Computer Technology: Computer System Architecture]

 
