基于GPU加速的全源对最短路径并行算法  被引量:1

All-source pair shortest path parallel algorithm based on GPU acceleration

在线阅读下载全文

作  者:肖汉 肖诗洋[2] 李焕勤 周清雷 XIAO Han;XIAO Shi-yang;LI Huan-qin;ZHOU Qing-lei(School of Information Science and Technology,Zhengzhou Normal University,Zhengzhou 450044,Henan,China;School of Civil Engineering,Southeast University,Nanjing 211189,Jiangsu,China;School of Computer and Artificial Intelligence,Zhengzhou University,Zhengzhou 450001,Henan,China)

机构地区:[1]郑州师范学院信息科学与技术学院,河南郑州450044 [2]东南大学土木工程学院,江苏南京211189 [3]郑州大学计算机与人工智能学院,河南郑州450001

出  处:《云南大学学报(自然科学版)》2023年第5期1022-1032,共11页Journal of Yunnan University(Natural Sciences Edition)

基  金:国家自然科学基金(61572444);河南省高等学校重点科研项目(22A520049).

摘  要:针对最短路径算法处理大规模数据集低效的问题,提出了基于图形处理器(Graphics Processing Unit,GPU)加速的全源对最短路径并行算法.首先通过优化矩阵乘法算法实现了在工作组内和组间进行并行运算数据,然后减少了非规则行造成的工作项分支,最后降低了工作项对邻接矩阵计算条带存储资源的访问延时.实验结果表明,与基于AMD Ryzen5 1600X CPU的串行算法、基于开放多处理(Open Multi-Processing, OpenMP)并行算法和基于统一计算设备架构(Compute Unified Device Architecture, CUDA)并行算法相比,最短路径并行算法在开放式计算语言(Open Computing Language, OpenCL)架构下NVIDIA GeForce GTX 1 070计算平台上分别获得了196.35、36.76和2.25倍的加速比,验证了提出的并行优化方法的有效性和性能可移植性.Aiming at the problem that the shortest path algorithm is inefficient in processing large-scale datasets,this paper proposes an all-pairs shortest paths parallel algorithm based on Graphics Processing Unit(GPU)acceleration.Firstly,the optimized matrix multiplication algorithm is used to realize the parallel computing data inter-workgroup and intra-workgroup.Then the branch of work-items caused by irregular rows is reduced.Finally,the access delay of work-items to the strip storage resources of adjacency matrix calculation is reduced.The experimental results show that compared with the performance of the serial algorithm based on AMD Ryzen51600X CPU,parallel algorithm based on Open Multi-Processing(OpenMP)and parallel algorithm based on Compute Unified Device Architecture(CUDA),the shortest path parallel algorithm obtains 196.35,36.76 and 2.25 times speedup in the NVIDIA GeForce GTX 1070 computing platform under the Open Computing Language(OpenCL)architecture respectively.The validity and performance portability of the proposed parallel optimization method are verified.

关 键 词:最短路径 重复平方法 图形处理器 开放式计算语言 并行算法 

分 类 号:TP311[自动化与计算机技术—计算机软件与理论]

 

参考文献:

正在载入数据...

 

二级参考文献:

正在载入数据...

 

耦合文献:

正在载入数据...

 

引证文献:

正在载入数据...

 

二级引证文献:

正在载入数据...

 

同被引文献:

正在载入数据...

 

相关期刊文献:

正在载入数据...

相关的主题
相关的作者对象
相关的机构对象