Authors: WEI Xueming (韦雪明)[1], ZHOU Lixin (周立昕), YIN Renchuan (尹仁川), XU Shihai (许仕海), JIANG Li (蒋丽)[1], LI Jianhua (李建华)
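Affiliations: [1] Guangxi Key Laboratory of Wireless Wideband Communication and Signal Processing, Guilin University of Electronic Technology, Guilin 541004, Guangxi, China; [2] Jiangxi Hongdu Aviation Industry Group Co., Ltd., Nanchang 330001, China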
Source: Journal of Guilin University of Electronic Technology (《桂林电子科技大学学报》), 2023, No. 6, pp. 465-472 (8 pages)
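Funding: Director's Fund of the Guangxi Key Laboratory of Wireless Wideband Communication and Signal Processing (GXKL06200131); Guilin University of Electronic Technology Graduate Education Innovation Program (2022YCXS034); Nanchang "Double Hundred Plan" Innovation Team Project.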
Abstract: To overcome the power-wall bottleneck of the traditional von Neumann architecture and improve the energy efficiency of multiply-accumulate (MAC) computation in artificial intelligence applications, an in-memory computing circuit based on an 8T static random-access memory (SRAM) array was designed, which effectively avoids the "memory wall" problem. The bias voltage of the storage cell was designed to stabilize the charging and discharging currents, which improves the linearity of the bitline discharge and increases computational accuracy. At the same time, with the discharge current held the same, the analog-to-digital converter (ADC) threshold coding was reduced and the area of the memory array was significantly reduced. The circuit was designed in a 65 nm CMOS process and performs 64-byte binary MAC computation through the parallel computing structure of an 8×72 memory array. Simulation results show that, in the mode with 3-bit ADC output and 8-bit comparison output, using core supply voltages of 0.8 V and 1.2 V and a clock frequency of 250 MHz, the circuit achieves a computational energy efficiency of 1.69 GOPS/W per bit. Compared with the theoretical baseline, the average deviation of the computed output is at most 1.05%, which effectively improves computational accuracy while reducing circuit area.
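As a reading aid for the abstract, the following is a minimal sketch of what a binary multiply-accumulate followed by a 3-bit ADC readout computes in the ideal case. It is not the authors' circuit model: the function names `binary_mac` and `adc_code`, the uniform quantizer, the 8-element column, and the full-scale value of 8 are all assumptions made only for illustration.

```python
def binary_mac(weights, inputs):
    """Ideal binary multiply-accumulate: count positions where both bits are 1."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    return sum(w & x for w, x in zip(weights, inputs))


def adc_code(mac_value, full_scale, bits=3):
    """Map an ideal MAC value onto 2**bits uniform output codes.

    Assumed model: a uniform quantizer with round-half-up, computed with
    integer arithmetic. The real ADC thresholds in the paper may differ.
    """
    top_code = (1 << bits) - 1
    code = (2 * mac_value * top_code + full_scale) // (2 * full_scale)
    return max(0, min(top_code, code))


# Hypothetical 8-element column of stored bits and an 8-bit input vector.
stored_bits = [1, 0, 1, 1, 0, 1, 1, 0]
input_bits = [1, 1, 1, 0, 0, 1, 1, 1]

ideal = binary_mac(stored_bits, input_bits)
code = adc_code(ideal, full_scale=len(stored_bits))
print(ideal, code)
```

In an analog in-memory array the accumulated value appears as a bitline discharge, so any nonlinearity in that discharge shifts the value seen by the ADC; the sketch shows only the ideal baseline against which such a deviation would be measured.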
Keywords: in-memory computing; CMOS; 8T SRAM; multiply-accumulate computation; high linearity
Classification: TN432 [Electronics and Telecommunications: Microelectronics and Solid-State Electronics]