Data Partitioning Method for Overlapping Communication and Computation on Graphics Processing Units  Cited by: 5

Novel GPU Data Partitioning Method to Overlap Communication and Computation


Authors: Zhang Bao (张保)[1], Cao Haijun (曹海军)[1], Dong Xiaoshe (董小社)[1], Li Dan (李丹)[1], Hu Leijun (胡雷钧)

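Affiliations: [1] Department of Computer Science and Technology, Xi'an Jiaotong University, Xi'an 710049; [2] State Key Laboratory of High-end Server and Storage Technology, Jinan 250013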

Source: Journal of Xi'an Jiaotong University (《西安交通大学学报》), 2011, No. 4, pp. 1-5, 11 (6 pages in total)

Funding: National High Technology Research and Development Program of China (2009AA01Z108; 2009AA01A135; 2006AA01A109); Fundamental Research Funds for the Central Universities (08142007)

Abstract: In "host core + coprocessor" heterogeneous parallel systems such as CPU+GPU platforms, the extra communication overhead of the host-coprocessor architecture is commonly handled by splitting the data into equal blocks and processing them in batches, which does not fully utilize system resources. To address this, a new proportional data partitioning method is proposed. Taking the system's communication bandwidth and the computing capacity of the graphics processing unit (GPU) into account, application data is partitioned in proportion into blocks of different sizes, which are then submitted to the GPU in batches, so that the transfer resource (the PCI-E bus) and the compute resource (the GPU) work in parallel for a period of time and communication overlaps with computation. While the proportionally partitioned blocks are processed, the transfer and compute resources are used as fully as possible, reducing the time that data transfer and computation spend waiting on each other. Experimental results show that the proportional partitioning method significantly improves application performance and effectively overlaps communication and computation time: the total execution times of matrix multiplication and fast Fourier transform (FFT) are reduced by about 5% and 30%, respectively, compared with no partitioning, and by about 3% and 6%, respectively, compared with equal partitioning.
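
The abstract does not give the partitioning formula, so the following is only an illustrative sketch of how a proportional split might overlap transfer and computation, not the authors' algorithm. It assumes a simple linear cost model (transfer and compute time both proportional to block size, no per-block launch overhead, results not copied back) and picks block sizes so that transferring block i+1 takes as long as computing block i. The names `pipeline_time`, `equal_blocks`, `proportional_blocks`, `t_xfer`, and `t_comp` are all hypothetical.

```python
def pipeline_time(blocks, t_xfer, t_comp):
    """Finish time of a two-stage pipeline (assumed model, not from the paper).

    Transfers run back to back on the bus; the GPU starts a block only
    after that block has arrived and the previous block has finished.
    t_xfer / t_comp are the assumed transfer / compute costs per data unit.
    """
    xfer_done = 0.0
    comp_done = 0.0
    for size in blocks:
        xfer_done += size * t_xfer
        comp_done = max(comp_done, xfer_done) + size * t_comp
    return comp_done


def equal_blocks(total, k):
    """Split `total` data units into k equal blocks (the baseline scheme)."""
    return [total / k] * k


def proportional_blocks(total, k, t_xfer, t_comp):
    """Split `total` into k blocks whose sizes grow by the ratio t_comp / t_xfer.

    With this ratio, the transfer of block i+1 takes exactly as long as the
    computation of block i, so under the assumed model neither resource
    waits once the first block has arrived.
    """
    ratio = t_comp / t_xfer
    weights = [ratio ** i for i in range(k)]
    scale = total / sum(weights)
    return [w * scale for w in weights]


if __name__ == "__main__":
    total, k = 1024.0, 4          # made-up workload size and block count
    t_xfer, t_comp = 1.0, 2.0     # made-up per-unit costs (compute-bound case)
    print("no partition:", pipeline_time([total], t_xfer, t_comp))
    print("equal       :", pipeline_time(equal_blocks(total, k), t_xfer, t_comp))
    print("proportional:", pipeline_time(
        proportional_blocks(total, k, t_xfer, t_comp), t_xfer, t_comp))
```

For these made-up costs the model ranks the schemes in the same order as the abstract reports (proportional faster than equal, equal faster than unpartitioned), although the magnitudes depend entirely on the assumed parameters and say nothing about the measured 3% to 30% figures.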

Keywords: graphics processing unit (GPU); overlapping communication and computation; data partitioning

Classification code: TP399 [Automation and Computer Technology: Computer Application Technology]

 
