Design and FPGA Implementation of Large Scale Matrix Inversion Accelerator Based on LDL Algorithm

Cited by: 2


Authors: YU Haoran (余浩然); XIAO Hao (肖昊)

Affiliation: [1] School of Microelectronics, Hefei University of Technology, Hefei, Anhui 230009, China

Source: Electronic Science and Technology (《电子科技》), 2023, No. 7, pp. 1-7 (7 pages)

Funding: National Natural Science Foundation of China (61974039); Aeronautical Science Foundation of China (2018ZCP4003).

Abstract: Matrix inversion is a basic problem in engineering computation. In large-scale MIMO systems, array signal processing, image signal processing, and similar applications, the speed of large-scale matrix inversion is critical to system performance. Traditional matrix inversion methods, however, have high computational complexity, low parallelism, and large storage requirements, which makes them poorly suited to hardware acceleration. To address hardware acceleration of large-scale matrix inversion, this paper studies a matrix inversion algorithm based on LDL decomposition and proposes a large-scale matrix inversion accelerator architecture built on it. Exploiting the fact that the diagonal elements of the triangular matrix produced by LDL decomposition are all 1, the design processes the matrix by block iteration, which reduces the computation required for inversion and increases speed. The accelerator is designed and implemented on a Xilinx Virtex7 FPGA. Experimental results show that for a matrix of order 128 the throughput reaches 105.2 Inv·s^(-1) and the maximum clock frequency reaches 200 MHz. Compared with existing matrix inversion acceleration schemes, the design uses fewer hardware resources and delivers higher performance.

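Keywords: LDL decomposition; matrix inversion; Cholesky decomposition; matrix partitioning; triangular matrix transformation; matrix multiplication; hardware acceleration; field-programmable gate array (FPGA)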

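Classification: TP309.7 [Automation and Computer Technology: Computer System Architecture]; TN99 [Automation and Computer Technology: Computer Science and Technology]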
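The abstract describes inverting a matrix through LDL decomposition and exploiting the unit diagonal of the triangular factor. As a rough illustration only, the software sketch below factors a real symmetric positive definite matrix as A = L·D·L^T and then forms A^(-1) = L^(-T)·D^(-1)·L^(-1). The function names (`ldl_decompose`, `invert_unit_lower`, `ldl_inverse`) and the restriction to real, positive definite inputs without pivoting are assumptions made for this example. It does not reproduce the paper's block-iterative partitioning or its FPGA architecture.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def ldl_decompose(a: Matrix) -> Tuple[Matrix, List[float]]:
    """Factor symmetric A as L * diag(d) * L^T with L unit lower triangular.

    Assumption: A is real, symmetric positive definite, so no pivoting is needed.
    """
    n = len(a)
    l = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [0.0] * n
    for j in range(n):
        # Diagonal entry of D for column j.
        d[j] = a[j][j] - sum(l[j][k] * l[j][k] * d[k] for k in range(j))
        # Entries of L below the diagonal in column j.
        for i in range(j + 1, n):
            s = sum(l[i][k] * l[j][k] * d[k] for k in range(j))
            l[i][j] = (a[i][j] - s) / d[j]
    return l, d


def invert_unit_lower(l: Matrix) -> Matrix:
    """Invert a unit lower triangular matrix by forward substitution.

    Because every diagonal entry is 1, no division is required here.
    """
    n = len(l)
    m = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for j in range(n):
        for i in range(j + 1, n):
            m[i][j] = -sum(l[i][k] * m[k][j] for k in range(j, i))
    return m


def ldl_inverse(a: Matrix) -> Matrix:
    """Compute A^(-1) = M^T * D^(-1) * M, where M = L^(-1)."""
    n = len(a)
    l, d = ldl_decompose(a)
    m = invert_unit_lower(l)
    inv = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            # M is lower triangular, so terms with k < i vanish.
            v = sum(m[k][i] * m[k][j] / d[k] for k in range(i, n))
            inv[i][j] = inv[j][i] = v  # the inverse is symmetric
    return inv


if __name__ == "__main__":
    a = [[4.0, 2.0, 2.0], [2.0, 5.0, 3.0], [2.0, 3.0, 6.0]]
    for row in ldl_inverse(a):
        print(row)
```

In this sketch the only divisions are by the entries of D, one reason a unit-diagonal factorization can be attractive for hardware. Whether the paper's design exploits it in the same way is not stated in the abstract.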

 
