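Affiliations: [1] School of Software, Dalian University of Technology, Dalian, Liaoning 116621, China; [2] School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China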
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2015, No. 12, pp. 2798-2802 (5 pages)
Funding: Supported by the National Natural Science Foundation of China (Grant No. 11372067)
Abstract: Scheduling algorithms are critical to the efficient execution of massively parallel threads in a GPGPU. The scheduler must choose a reasonable degree of thread-level parallelism (TLP) according to the computational characteristics of the application and the configuration of the various logic units in the GPGPU. However, existing scheduling algorithms either fix the TLP statically or adjust it at too coarse a granularity, so none of them can maintain a reasonable TLP parameter while adjusting it dynamically at runtime. Based on the two-level scheduling algorithm (TL), this work proposes Dynamic Two-Level Scheduling (DTL) and Adaptive Dynamic Two-Level Scheduling (D2TL), which dynamically adjust the TLP in the fine-grained warp scheduler. DTL and D2TL monitor runtime hardware resource utilization and instruction patterns in the GPGPU and adjust the parallelism parameter accordingly. Simulation experiments on the performance simulator GPGPU-Sim show that, compared with the original TL scheduling algorithm, DTL and D2TL achieve average speedups of 14.4% and 19.6%, respectively.
Classification number: TP331 [Automation and Computer Technology: Computer System Architecture]