Authors: GUO Shuaizhe (郭帅哲); GAO Jianhua (高建花); JI Weixing (计卫星) (School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China)
Source: Computer Science (《计算机科学》), 2024, No. 9, pp. 15-22 (8 pages)
Abstract: The generalized minimum residual (GMRES) method is an iterative method for solving sparse linear systems and is widely used in scientific and engineering computing. Explosive data growth is rapidly increasing the scale of the problems that GMRES must solve. To support large-scale problems, researchers have proposed distributed GMRES algorithms for clusters. In most existing clusters, however, the inter-node network still lags well behind the high-speed intra-node GPU interconnect in both bandwidth and latency, which limits the performance of distributed GMRES. For distributed GMRES on GPU clusters, this paper proposes a mixed-precision acceleration method that represents transferred data in a low-precision floating-point format, significantly reducing the time spent on communication. It also proposes a precision-control algorithm that dynamically and adaptively adjusts the precision of the transferred data so that the iterative solver still reaches a satisfactory residual. Experimental results show that the proposed mixed-precision optimization achieves an average speedup of 2.4×, and an average speedup of 7.6× when combined with other optimizations.
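The abstract does not describe how the precision-control algorithm works internally, so the following is only an illustrative sketch of the general idea of sending data in the cheapest floating-point format that stays within an error budget. It is not the authors' method. The names `FORMATS`, `round_trip`, `relative_error`, and `choose_precision`, as well as the relative-error criterion and the `tolerance` parameter, are assumptions introduced here for illustration. The sketch uses only the Python standard library and simulates the transfer by packing and unpacking bytes.

```python
import math
import struct

# Candidate wire formats, lowest precision first: (label, struct format code).
# "e" = IEEE 754 half, "f" = single, "d" = double.
FORMATS = [("fp16", "e"), ("fp32", "f"), ("fp64", "d")]


def round_trip(values, code):
    """Pack values in the given format and unpack them, as a receiver would."""
    fmt = f"<{len(values)}{code}"
    payload = struct.pack(fmt, *values)
    return list(struct.unpack(fmt, payload)), len(payload)


def relative_error(original, received):
    """Relative 2-norm error introduced by the lossy round trip."""
    num = math.sqrt(sum((a - b) ** 2 for a, b in zip(original, received)))
    den = math.sqrt(sum(a * a for a in original))
    return num / den if den > 0 else 0.0


def choose_precision(values, tolerance):
    """Return (label, payload_bytes) for the cheapest format within tolerance.

    Hypothetical selection rule: try formats from narrowest to widest and
    stop at the first one whose round-trip error is acceptable.
    """
    for label, code in FORMATS:
        try:
            received, nbytes = round_trip(values, code)
        except (OverflowError, struct.error):
            continue  # a value does not fit in this format; try a wider one
        if relative_error(values, received) <= tolerance:
            return label, nbytes
    return "fp64", 8 * len(values)  # fallback: send at full precision
```

In an iterative solver, a rule of this kind could be re-evaluated as the residual shrinks, with `tolerance` tightened over time so that early iterations send less data and later ones keep more precision. Whether the paper ties precision to the residual in this way is not stated in the abstract.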