Authors: TANG Tao (唐滔)[1]; PENG Lin (彭林)[1]; HUANG Chun (黄春)[1]; YANG Can-qun (杨灿群)[1]
Affiliation: [1] College of Computer Science, National University of Defense Technology (国防科学技术大学计算机学院), Changsha 410073, China
Source: Computer Science (《计算机科学》), 2017, No. 12, pp. 1-10 (10 pages)
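Funding: Supported by the National Natural Science Foundation of China (61402488) and the Doctoral Program Fund of the Ministry of Education of China (20134307120035)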
Abstract: With higher peak performance and energy efficiency than traditional CPUs, together with an increasingly mature software environment, GPUs have become one of the most popular accelerators for building heterogeneous parallel computing systems. Although a GPU hides memory access latency through flexible switching among lightweight threads, its massive concurrency still puts the memory system under severe pressure, and its achieved performance depends strongly on memory access efficiency. The analysis and optimization of GPU programs' memory access behavior has therefore long been a hot research topic in GPU-related studies, yet few works have analyzed the impact of memory hierarchy design on performance from an architectural perspective. To better guide GPU memory hierarchy design and memory access optimization, this paper experimentally analyzes in detail how each level of the GPU memory hierarchy affects program performance, and summarizes several guiding optimization strategies as recommendations for memory hierarchy design and program optimization on future GPU-like architectures.
Classification number: TP302.7 [Automation and Computer Technology: Computer System Architecture]