An Improved Conjugate Residual Algorithm Suited to Parallel Computing (一种改进的适合并行计算的共轭剩余算法)  Cited by: 5

An Improved Conjugate Residual Algorithm for Large Symmetric Linear Systems

Authors: Liu Jie (刘杰)[1], Liu Xingping (刘兴平)[2], Chi Lihua (迟利华)[1], Hu Qingfeng (胡庆丰)[1]

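Affiliations: [1] National Key Laboratory for Parallel and Distributed Processing, School of Computer Science, National University of Defense Technology, Changsha 410073; [2] Institute of Applied Physics and Computational Mathematics, Beijing 100088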

Source: Chinese Journal of Computers (《计算机学报》), 2006, No. 3, pp. 495-499 (5 pages)

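Funding: Supported by the National Natural Science Foundation of China (40245023) and the Fund of the National Key Laboratory of Computational Physics (51479040103KG0201)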

Abstract: The conjugate residual (CR) algorithm is a Krylov subspace method for solving symmetric linear systems with very large, very sparse coefficient matrices. By changing the order of computation in the CR algorithm, this paper proposes an improved conjugate residual (ICR) algorithm. ICR has the same numerical stability as CR and adds almost no computation, but it is designed with parallel performance on MIMD machines in mind: its synchronization overhead, which is the bottleneck for parallel performance, is half that of CR. In addition, all inner products and the matrix-vector product within a single iteration step are independent of one another, with no data dependencies, so the communication required for the inner products can be overlapped with the computation of the vector updates. The performance of ICR is analyzed both theoretically and experimentally, and ICR is found to be faster than CR when the number of processors is large. Numerical experiments on a 64-processor cluster show that parallel ICR is approximately 30% faster than CR.
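For orientation, the following is a minimal sketch of the standard textbook CR iteration, not the ICR variant proposed in the paper, whose reordered recurrences are not given in this record. It assumes a small dense symmetric positive definite matrix stored as nested lists; the helper names `matvec`, `dot`, and `conjugate_residual` are illustrative only. The comments mark the two inner-product steps that, in a distributed implementation of this formulation, would each require a global reduction and depend on one another, which is the kind of synchronization cost the abstract refers to.

```python
def matvec(A, v):
    """Dense matrix-vector product; A is a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def dot(u, v):
    """Inner product; on a parallel machine this would be a global reduction."""
    return sum(a * b for a, b in zip(u, v))


def conjugate_residual(A, b, tol=1e-10, max_iter=1000):
    """Textbook CR for symmetric A, starting from x0 = 0.

    Returns the approximate solution and the number of iterations performed.
    """
    n = len(b)
    x = [0.0] * n
    r = list(b)              # r0 = b - A*x0 with x0 = 0
    p = list(r)
    Ar = matvec(A, r)
    Ap = list(Ar)
    rAr = dot(r, Ar)
    for k in range(max_iter):
        # Convergence check (an extra reduction in this sketch; in practice
        # it could be bundled with one of the reductions below).
        if dot(r, r) ** 0.5 <= tol:
            return x, k
        # Synchronization point 1: alpha needs the global value (Ap, Ap).
        alpha = rAr / dot(Ap, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        Ar = matvec(A, r)
        # Synchronization point 2: beta needs (r_new, A r_new), which can
        # only be formed after alpha is known.
        rAr_new = dot(r, Ar)
        beta = rAr_new / rAr
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        Ap = [ari + beta * api for ari, api in zip(Ar, Ap)]
        rAr = rAr_new
    return x, max_iter


if __name__ == "__main__":
    A = [[4.0, 1.0], [1.0, 3.0]]
    b = [1.0, 2.0]
    x, iters = conjugate_residual(A, b)
    print(x, iters)
```

Because the second inner product depends on the step length computed from the first, the two reductions in this formulation cannot be merged without reordering the recurrences, which is the general idea behind reorganized variants such as the one the abstract describes.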

Keywords: conjugate residual algorithm; synchronization overhead; parallel computing; cluster; large symmetric sparse linear systems

Classification: TP301 [Automation and Computer Technology: Computer System Architecture]

 
