Authors: ZHOU Yang; WANG Jiawei [1]; HUANG Zhihong [1]; YANG Haigang [1,2] (Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China; School of Microelectronics, University of Chinese Academy of Sciences, Beijing 100049, China)
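Affiliations: [1] Programmable Chip and System Research Laboratory, Institute of Electronics, Chinese Academy of Sciences, Beijing 100190; [2] School of Microelectronics, University of Chinese Academy of Sciences, Beijing 100049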
Source: Journal of Terahertz Science and Electronic Information Technology (《太赫兹科学与电子信息学报》), 2018, No. 2, pp. 342-346, 362 (6 pages)
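Funding: National Natural Science Foundation of China (61704173; 61474120); Beijing Major Science and Technology Project (Z171100000117019); Open Project Fund of the Beijing Key Laboratory of Big Data Technology for Food Safety, Beijing Technology and Business University (BKBD-2017KF05)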
Abstract: Matrix operations are widely used in circuits with real-time requirements, and matrix inversion is the hardest of them to implement. Implementing matrix inversion on a Field Programmable Gate Array (FPGA) exploits the speed and parallelism of hardware to accelerate the computation. Based on an improved systolic array architecture, an optimized reduction-factor (simplification factor) inversion algorithm is adopted: any n×n upper triangular matrix is converted into an upper triangular matrix whose diagonal entries are all 1, which separates the division operations from the multiply-add operations and greatly simplifies the inversion. Taking a 4×4 upper triangular matrix as an example, the algorithm was implemented and functionally verified on a Virtex5 FPGA under Xilinx ISE; the complete inversion takes 14 cycles and uses 2 dividers, 3 multipliers and 4 adders. Compared with the classical systolic array architecture, the design occupies only about half the resources while improving performance by 26.43%. Compared with a systolic array implementation that integrates more Processing Elements (PE), it delivers nearly the same performance with resource usage reduced to 1/4, greatly improving resource utilization.
Classification: TN402 [Electronics and Telecommunications: Microelectronics and Solid-State Electronics]
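
The abstract's central idea, normalizing an upper triangular matrix to unit diagonal so that division is separated from multiply-add, can be illustrated in software. The sketch below is one plausible reading of that idea and is not the paper's hardware design or its exact algorithm: it assumes the factorization U = D·V, where D is the diagonal of U and V has a unit diagonal, so that U⁻¹ = V⁻¹·D⁻¹. The function name `invert_upper_triangular` and the use of exact rational arithmetic are assumptions made here for illustration only.

```python
from fractions import Fraction


def invert_upper_triangular(u):
    """Invert an upper triangular matrix given as a list of rows.

    Illustrative sketch (an assumption, not the paper's method):
    factor U = D * V with V unit upper triangular, so that all
    divisions happen once up front and the rest is multiply-add.
    """
    n = len(u)

    # Division stage: one reciprocal per diagonal entry, nothing else.
    # A zero on the diagonal raises ZeroDivisionError (singular matrix).
    recip = [1 / Fraction(u[i][i]) for i in range(n)]

    # Normalize each row so the diagonal becomes 1 (multiplications only).
    unit = [[Fraction(u[i][j]) * recip[i] for j in range(n)] for i in range(n)]

    # Multiply-add stage: invert the unit upper triangular matrix V.
    # For i < j, W[i][j] = -sum_{k=i+1..j} V[i][k] * W[k][j]; W[j][j] = 1.
    inv_unit = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for j in range(n):
        for i in range(j - 1, -1, -1):
            acc = Fraction(0)
            for k in range(i + 1, j + 1):
                acc += unit[i][k] * inv_unit[k][j]
            inv_unit[i][j] = -acc

    # Undo the normalization: U^-1 = V^-1 * D^-1, i.e. scale column j.
    return [[inv_unit[i][j] * recip[j] for j in range(n)] for i in range(n)]
```

Exact fractions are used only to make the result easy to check; a hardware implementation would use fixed-point or floating-point units, and the sketch says nothing about cycle counts or resource usage.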