Parallel optimization of single source shortest path algorithm under CUDA

Cited by: 3

Authors: ZHANG Han; QIAN Yu-rong [1]; WANG Yue-fei [1]; CHEN Ren-he; TIAN Chen-wei (School of Software, Xinjiang University, Urumqi 830008, China)

Affiliation: [1] School of Software, Xinjiang University

Source: Computer Engineering and Design (《计算机工程与设计》), 2019, No. 8, pp. 2181-2189 (9 pages)

Funding: National Natural Science Foundation of China (61562086, 61462079); Innovation Team Fund Project of Xinjiang Uygur Autonomous Region (XJEDU2017T002)

Abstract: This work designs a parallel optimization scheme for the fixed-order Bellman-Ford algorithm on the CUDA platform, taking into account that the algorithm is both compute-intensive and data-intensive. At the kernel computation level, it proposes a memory access optimization method and uses the fixed order to reduce thread divergence. At the CPU-GPU transfer level, it proposes a method based on CUDA streams to reduce data transfer overhead. In tests on different graphics cards, thread blocks were sized with reference to the shared memory capacity, the vector dimension was reduced after iteration, and CUDA streams were used to shorten the latency of the first computation. Compared with the conventional algorithm, the improved parallel algorithm achieves a speedup of about 200 times. The scheme shows that the fixed-order approach is feasible and portable on the CUDA platform and can serve as a reference for multi-platform research.

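Keywords: fixed-order improved algorithm; Bellman-Ford algorithm; parallel computing; performance portability; graphics processing unit (GPU); Compute Unified Device Architecture (CUDA)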

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
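
The abstract builds on Bellman-Ford relaxation without spelling it out, so a minimal sketch of the textbook algorithm may help. It is not the authors' fixed-order variant or their CUDA kernel; the function name `bellman_ford_rounds` and the edge-list layout are assumptions made for illustration. Each round reads only the previous round's distances, so the edge relaxations within a round are independent of one another, which is the property an edge-per-thread GPU kernel would typically rely on.

```python
import math


def bellman_ford_rounds(num_vertices, edges, source):
    """Textbook Bellman-Ford, written round by round (hypothetical helper).

    edges is a list of (u, v, weight) tuples. Each round reads the
    distances from the previous round and writes into a fresh copy, so
    the relaxations inside one round do not depend on each other.
    """
    dist = [math.inf] * num_vertices
    dist[source] = 0.0
    for _ in range(num_vertices - 1):
        new_dist = list(dist)
        changed = False
        for u, v, w in edges:  # a GPU version could assign one thread per edge
            candidate = dist[u] + w
            if candidate < new_dist[v]:  # a GPU version would need an atomic min here
                new_dist[v] = candidate
                changed = True
        dist = new_dist
        if not changed:  # no distance improved, so later rounds cannot either
            break
    return dist


# Small example graph: 0->1 (4), 0->2 (1), 2->1 (2), 1->3 (1)
edges = [(0, 1, 4.0), (0, 2, 1.0), (2, 1, 2.0), (1, 3, 1.0)]
print(bellman_ford_rounds(4, edges, 0))  # [0.0, 3.0, 1.0, 4.0]
```

The early exit when no distance changes is a common refinement of the textbook algorithm; whether it corresponds to the vector dimension reduction mentioned in the abstract is not stated in the source.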
