Runtime Aware Scheduling Algorithm with Dynamic Parallelism in GPGPU

Original Chinese title: GPGPU上基于运行时特征的动态并行度调度算法


Authors: Yu Yulong[1], Wang Yuxin[2], Guo He[1]

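Affiliations: [1] School of Software, Dalian University of Technology, Dalian 116621, Liaoning, China; [2] School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, Liaoning, China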

Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2015, No. 12, pp. 2798-2802 (5 pages)

Funding: Supported by the National Natural Science Foundation of China (11372067)

Abstract: Scheduling algorithms are critical to the efficient execution of massively concurrent threads in a GPGPU. The scheduler must choose a reasonable degree of thread-level parallelism (TLP) at runtime according to the computational characteristics of the application and the configuration of the various logical units in the GPGPU. However, existing scheduling algorithms either use a fixed, static TLP or adjust TLP at too coarse a granularity, so none of them can keep the TLP parameter reasonable while adjusting it dynamically. Based on the Two-Level scheduling algorithm (TL), this work proposes Dynamic Two-Level Scheduling (DTL) and Adaptive Dynamic Two-Level Scheduling (D2TL), which dynamically adjust TLP in the fine-grained warp scheduler. DTL and D2TL monitor hardware resource utilization and instruction patterns at runtime and adjust the parallelism parameter accordingly. Simulation experiments on the performance simulator GPGPU-Sim show that DTL and D2TL achieve average speedups of 14.4% and 19.6%, respectively, over the original TL scheduling algorithm.

Keywords: GPGPU; two-level scheduling; thread-level parallelism; dynamic parallelism

Classification number: TP331 [Automation and Computer Technology: Computer Architecture]

 
