检索规则说明:AND代表“并且”;OR代表“或者”;NOT代表“不包含”;(注意必须大写,运算符两边需空一格)
检 索 范 例 :范例一: (K=图书馆学 OR K=情报学) AND A=范并思 范例二:J=计算机应用与软件 AND (U=C++ OR U=Basic) NOT M=Visual
作 者:刘杰[1] 迟利华[1] 胡庆丰[1] 李晓梅[2]
机构地区:[1]国防科学技术大学计算机学院,长沙410073 [2]装备指挥技术学院,北京101416
出 处:《计算机研究与发展》2005年第7期1235-1240,共6页Journal of Computer Research and Development
基 金:国家自然科学基金项目(40245023);计算物理国家重点实验室基金项目(51479040103KG0201)
摘 要:TFQMR算法是一种Krylov子空间算法,常用来求解大型稀疏线性方程组.通过改变TFQMR算法的计算次序,提出了一种改进的TFQMR(ITFQMR)算法.对比TFQMR算法,ITFQMR算法的数值稳定性和TFQMR算法相同,几乎没有增加计算量,但考虑了在MIMD并行机上实现时并行算法的性能,其同步开销减少为TFQMR算法的一半,并且所有内积计算以及矩阵向量乘是独立的,没有数据相关性,可以进行计算与通信的重叠.从理论和实验两个角度来讨论ITFQMR算法的性能,当处理机台数较多时,ITFQMR算法的计算速度快于TFQMR算法.实验说明了在有64台处理机机群上进行,最快的并行ITFQMR算法的计算速度大约比TFQMR算法快20%.The transpose-free QMR(TFQMR) algorithm is a Krylov subspace algorithm that can be used to obtain fast solutions for linear systems with very large and very sparse coefficient matrices. In this paper, by changing the computation sequence in the TFQMR algorithm, an improved transpose-free QMR (ITFQMR) algorithm is proposed. The numerical stability of the ITFQMR algorithm is the same as TFQMR algorithm, but the synchronization overhead that represents the bottleneck of the parallel performance is effectively reduced by a factor of two. And all inner products of a single iteration step are independent and communication time required for inner product can be overlapped efficiently with computation time of vector updates. From the theoretical and experimental analysis it is found that the ITFQMR algorithm is faster than the TFQMR algorithm as the number of processors increases. The experiments performed on a 64-processor cluster indicate that the ITFQMR is approximately 20% faster than the TFQMR.
关 键 词:TFQMR算法 同步开销 并行计算 机群 大型稀疏线性方程组
分 类 号:TP301[自动化与计算机技术—计算机系统结构]
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在链接到云南高校图书馆文献保障联盟下载...
云南高校图书馆联盟文献共享服务平台 版权所有©
您的IP:216.73.216.117