GPU-based preconditioned conjugate gradient method for solving sparse linear systems

Cited by: 11

Source: Journal of Computer Applications (《计算机应用》), 2013, Issue 3, pp. 825-829 (5 pages)

Funding: National Natural Science Foundation of China (51109072)

Abstract: This paper studies GPU acceleration of the preconditioned conjugate gradient (PCG) method for solving sparse linear systems. The programs were written on the Compute Unified Device Architecture (CUDA) platform, and their performance was tested and analyzed on an NVIDIA GT430 GPU. The sparse matrix is stored in the compressed sparse row (CSR) format. Based on the algorithmic characteristics of the PCG method, the work investigates how to optimize sparse matrix-vector multiplication on the GPU and how to speed up data transfer from the CPU to the GPU. The self-developed sparse matrix-vector multiplication kernel was compared with the cusparseDcsrmv function of the CUSPARSE library and achieved a speed-up of up to 2.1 times. For the PCG method as a whole, the implementation built on self-developed kernels has a slight advantage over the one built on the CUBLAS and CUSPARSE libraries, and it achieves a speed-up of up to 7.4 times over the CPU implementation.
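The two building blocks named in the abstract, CSR sparse matrix-vector multiplication and the PCG iteration, can be illustrated with a small CPU-side reference sketch. This is not the paper's CUDA code: the abstract does not say which preconditioner was used, so the Jacobi (diagonal) preconditioner here is an assumption, and the function names `csr_matvec`, `csr_diagonal`, and `pcg` are hypothetical. On a GPU, the outer row loop of `csr_matvec` is the part that would typically be distributed across threads.

```python
import math


def csr_matvec(row_ptr, col_idx, values, x):
    """Return y = A @ x for a matrix A stored in CSR format."""
    n = len(row_ptr) - 1
    y = [0.0] * n
    for i in range(n):  # rows are independent, so they can run in parallel
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


def csr_diagonal(row_ptr, col_idx, values):
    """Extract the main diagonal of a CSR matrix."""
    n = len(row_ptr) - 1
    diag = [0.0] * n
    for i in range(n):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            if col_idx[k] == i:
                diag[i] = values[k]
    return diag


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pcg(row_ptr, col_idx, values, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A.

    Assumption: a Jacobi preconditioner M = diag(A) is used; the
    preconditioner in the original paper is not stated in the abstract.
    Returns the solution and the number of iterations performed.
    """
    n = len(b)
    inv_diag = [1.0 / d for d in csr_diagonal(row_ptr, col_idx, values)]
    x = [0.0] * n
    r = list(b)                                   # r = b - A x with x = 0
    z = [inv_diag[i] * r[i] for i in range(n)]    # z = M^{-1} r
    p = list(z)
    rz = dot(r, z)
    b_norm = math.sqrt(dot(b, b)) or 1.0
    for it in range(max_iter):
        if math.sqrt(dot(r, r)) <= tol * b_norm:  # relative residual test
            return x, it
        ap = csr_matvec(row_ptr, col_idx, values, p)
        alpha = rz / dot(p, ap)
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * ap[i] for i in range(n)]
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [z[i] + beta * p[i] for i in range(n)]
        rz = rz_new
    return x, max_iter


# Example: A = [[4, 1, 0], [1, 3, 1], [0, 1, 2]] in CSR form.
row_ptr = [0, 2, 5, 7]
col_idx = [0, 1, 0, 1, 2, 1, 2]
values = [4.0, 1.0, 1.0, 3.0, 1.0, 1.0, 2.0]
b = csr_matvec(row_ptr, col_idx, values, [1.0, 2.0, 3.0])
solution, iterations = pcg(row_ptr, col_idx, values, b)
```

Each PCG iteration performs one sparse matrix-vector product plus a few dot products and vector updates, which is why the abstract singles out the matrix-vector kernel as the main optimization target.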

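Keywords: graphics processing unit (GPU); sparse linear equations; preconditioned conjugate gradient method; compressed sparse row (CSR); Compute Unified Device Architecture (CUDA)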

Classification: TP312 [Automation and Computer Technology: Computer Software and Theory]
