Authors: Gong Shuguang (龚曙光)[1], Liu Qiliang (刘奇良)[1], Lu Haishan (卢海山), Zhou Zhiyong (周志勇)[1], Zhang Jia (张佳)[1]
Source: Chinese Journal of Computational Mechanics (《计算力学学报》), 2015, No. 6, pp. 745-751 (7 pages)
Funding: Supported by the National Natural Science Foundation of China (51375417; 51405415)
Abstract: To reduce the computation time of the element-free Galerkin (EFG) method, a GPU-accelerated parallel EFG algorithm is proposed in which essential boundary conditions are imposed by the penalty function method. The stiffness matrix is assembled node pair by node pair, and the sparse linear system, stored in CSR format, is solved by the conjugate gradient method. A unified format for the stiffness matrix and the penalty stiffness matrix is given, together with a flow chart of the GPU-accelerated parallel algorithm. The GPU code was written on the CUDA platform, and the algorithm was tested, analyzed, and compared through numerical examples on an NVIDIA GeForce GTX 660 graphics card; the factors affecting the speedup ratio are discussed. The results verify the feasibility of the proposed algorithm: with the required accuracy maintained, the maximum speedup reaches 17 times, and the solution of the linear system has a decisive effect on the speedup.
Keywords: element-free Galerkin method; GPU acceleration; parallel computing; CUDA
Classification: TH123 [Mechanical Engineering: Mechanical Design and Theory]; O241.82 [Science: Computational Mathematics]
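Note: The abstract names two building blocks, CSR storage of the sparse system and a conjugate gradient solver. The following is a minimal serial sketch of how those two pieces fit together. It is not the authors' CUDA code: the function names (`csr_matvec`, `conjugate_gradient`) and the 3x3 test matrix are assumptions made purely for illustration, and the penalty-method assembly is not shown.

```python
def csr_matvec(data, indices, indptr, x):
    """Return y = A @ x, where row i of A owns data[indptr[i]:indptr[i + 1]]."""
    y = [0.0] * (len(indptr) - 1)
    for i in range(len(y)):
        # Each row is independent of the others.
        for k in range(indptr[i], indptr[i + 1]):
            y[i] += data[k] * x[indices[k]]
    return y


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(data, indices, indptr, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite A stored in CSR form."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x, with the initial guess x = 0
    p = list(r)          # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Ap = csr_matvec(data, indices, indptr, p)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Hypothetical SPD test matrix [[4, 1, 0], [1, 3, 1], [0, 1, 2]] in CSR form.
data = [4.0, 1.0, 1.0, 3.0, 1.0, 1.0, 2.0]
indices = [0, 1, 0, 1, 2, 1, 2]
indptr = [0, 2, 5, 7]
b = [1.0, 2.0, 3.0]
x = conjugate_gradient(data, indices, indptr, b)
```

In this sketch the rows in `csr_matvec` do not depend on one another, which is the kind of loop a GPU version would typically distribute across threads; how the paper actually maps the work onto CUDA threads is not stated in the abstract.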