Authors: SHI Rui (时睿); ZUO Yunfan (左芸帆); YAN Hao (闫浩). School of Integrated Circuits, Southeast University, Nanjing 210096, China
Source: Integrated Circuits and Embedded Systems (《集成电路与嵌入式系统》), 2024, No. 1, pp. 13-18 (6 pages)
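Funding: Jiangsu Province Postgraduate Practice and Innovation Program, "FPGA Acceleration of Matrix Back-Substitution for CCS Timing Models" (SJCX220052).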
Abstract: Transient circuit simulation often requires building linear system models, and solving triangular systems with multiple right-hand sides sequentially is time-consuming. To speed up this triangular back-substitution, the paper proposes a parallel computing method based on a heterogeneous platform. The method prioritizes the multiplications that involve the solution vector, which exposes the parallelism in the back-substitution computation. The architecture consists of an operation array built from multiple floating-point units and a control module driven by a two-level master-slave state machine. The architecture is implemented on an XCZU15EG FPGA of the Zynq UltraScale series and compared against an Intel 24-core CPU platform using the MKL solver library in linear matrix solution experiments. All test matrices are symmetric positive definite and diagonally dominant, with a density above 50%. The proposed architecture achieves an average speedup of 22x, with solution errors in the range 10^-17 to 10^-14. The results show that the architecture improves matrix solution speed and is suited to forward and backward substitution for higher-dimensional linear systems in transient circuit simulation.
Keywords: triangular matrix solving; hardware acceleration; field-programmable gate array (FPGA); transient simulation
Classification: TP332.1 [Automation and Computer Technology: Computer System Architecture]
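
The abstract's idea of computing the multiplications that involve the solution vector first can be illustrated with a column-oriented forward substitution over several right-hand sides. The following is a minimal Python sketch under that assumption, not the paper's FPGA architecture; the function name `forward_substitute_multi_rhs` and the list-of-lists matrix layout are hypothetical choices made for illustration.

```python
def forward_substitute_multi_rhs(L, B):
    """Solve L X = B for a lower-triangular L and several right-hand sides.

    L: n x n list of lists (lower triangular, nonzero diagonal).
    B: n x m list of lists, one column per right-hand side.
    Returns X as an n x m list of lists.
    (Hypothetical helper for illustration only.)
    """
    n = len(L)
    m = len(B[0])
    # Work on a copy so the caller's right-hand sides are left untouched.
    R = [row[:] for row in B]
    X = [[0.0] * m for _ in range(n)]
    for j in range(n):
        # Row j has no pending updates left, so one division finishes it.
        X[j] = [R[j][k] / L[j][j] for k in range(m)]
        # Multiplications that use the newly solved entries come first.
        # Each (i, k) update below is independent of the others, so a
        # hardware operation array could issue them in parallel.
        for i in range(j + 1, n):
            lij = L[i][j]
            for k in range(m):
                R[i][k] -= lij * X[j][k]
    return X


if __name__ == "__main__":
    L = [[2.0, 0.0, 0.0],
         [1.0, 3.0, 0.0],
         [4.0, 5.0, 6.0]]
    B = [[2.0, 4.0],
         [10.0, 14.0],
         [49.0, 64.0]]
    print(forward_substitute_multi_rhs(L, B))
```

In this ordering the only sequential dependency is the single division per row; all remaining work in each step is a batch of independent multiply-subtract operations, which is the kind of parallelism a floating-point operation array can exploit. Backward substitution for an upper-triangular factor follows the same pattern with the loop over `j` reversed.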