Research on Accelerating Triangular Matrix Back-Substitution on Heterogeneous Platforms


Authors: SHI Rui (时睿); ZUO Yunfan (左芸帆); YAN Hao (闫浩) (School of Integrated Circuits, Southeast University, Nanjing 210096, China)

Affiliation: [1] School of Integrated Circuits, Southeast University, Nanjing 210096, China

Source: Integrated Circuits and Embedded Systems (《集成电路与嵌入式系统》), 2024, No. 1, pp. 13-18 (6 pages)

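Funding: Jiangsu Province Postgraduate Practice Innovation Program, "FPGA Acceleration of Matrix Back-Substitution for CCS Timing Models" (SJCX220052).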

Abstract: Transient circuit simulation commonly builds linear system models, and sequentially solving triangular systems with multiple right-hand sides is time-consuming. To speed up this triangular back-substitution step, the paper proposes a parallel computing method on a heterogeneous platform for solving triangular matrices quickly. By computing the multiplications that involve the solution vector first, the method exposes the parallelism in back-substitution. The design centers on an arithmetic array of multiple floating-point units and a control module built on a two-level master-slave state machine. Linear system solving experiments were run on an XCZU15EG FPGA of the Zynq UltraScale series and compared against an Intel 24-core CPU platform using the MKL solver library; all test matrices were symmetric positive definite, diagonally dominant, and more than 50% dense. The proposed architecture achieves an average speedup of 22x, with solution errors in the range 10^-17 to 10^-14. The results show that the architecture improves matrix solving speed to a certain extent and is suited to forward and backward substitution for linear systems of relatively high dimension.

Keywords: triangular matrix solving; hardware acceleration; field-programmable gate array (FPGA); transient simulation

Classification No.: TP332.1 [Automation and Computer Technology: Computer System Architecture]
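
Illustrative note: the abstract describes computing the multiplications that involve the solution vector first in order to expose parallelism. The abstract does not give the actual scheduling, so the following is only a minimal software sketch of one common reading of that idea, column-oriented forward substitution with several right-hand sides. The function name and data layout are hypothetical and are not taken from the paper.

```python
def forward_substitution_multi_rhs(L, B):
    """Solve L X = B for a lower-triangular L and several right-hand sides.

    L is an n x n list of lists, B is an n x m list of lists.
    Hypothetical helper for illustration; not the paper's architecture.
    """
    n = len(L)
    m = len(B[0])
    # Work on a copy so the caller's right-hand sides are left untouched.
    R = [row[:] for row in B]
    X = [[0.0] * m for _ in range(n)]
    for j in range(n):
        # Row j has already received every update it needs, so x_j is final.
        for k in range(m):
            X[j][k] = R[j][k] / L[j][j]
        # All products below depend only on the freshly computed x_j.
        # They are mutually independent, so a hardware array of
        # floating-point units could issue them in parallel.
        for i in range(j + 1, n):
            lij = L[i][j]
            for k in range(m):
                R[i][k] -= lij * X[j][k]
    return X


L_demo = [[2.0, 0.0, 0.0],
          [1.0, 3.0, 0.0],
          [4.0, 5.0, 6.0]]
B_demo = [[2.0, 4.0],
          [10.0, 14.0],
          [49.0, 64.0]]
X_demo = forward_substitution_multi_rhs(L_demo, B_demo)
```

In this ordering, each newly solved component is immediately multiplied into all remaining rows and all right-hand sides, which is where the independent multiply-subtract operations come from. Backward substitution for an upper-triangular matrix follows the same pattern with the column index running from n-1 down to 0.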

 
