Authors: Tong Li (李彤); Hong-Lan Jiang (姜红兰); Hai Mo (莫海); Jie Han (韩杰); Lei-Bo Liu (刘雷波); Zhi-Gang Mao (毛志刚)
Affiliations: [1] Department of Micro-Nano Electronics, Shanghai Jiao Tong University, Shanghai 200240, China; [2] School of Integrated Circuits, Tsinghua University, Beijing 100084, China; [3] Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB T6G 1H9, Canada
Source: Journal of Computer Science & Technology (计算机科学技术学报, English edition), 2023, Issue 2, pp. 309-327, 19 pages
Funding: Supported in part by the National Natural Science Foundation of China under Grant No. 62104127 and the National Key Research and Development Program of China under Grant No. 2022YFB4500200.
Abstract: As a primary computation unit, a processing element (PE) is key to the energy efficiency of a convolutional neural network (CNN) accelerator. Taking advantage of the inherent error tolerance of CNNs, approximate computing with high hardware efficiency has been considered for implementing the computation units of CNN accelerators. However, individual approximate designs such as multipliers and adders can achieve only limited accuracy and hardware improvements. In this paper, an approximate PE is devised specifically for CNN accelerators by jointly considering the data representation, multiplication and accumulation. An approximate data format is defined for the weights using stochastic rounding. This data format enables a simple implementation of multiplication using small lookup tables, an adder and a shifter. Two approximate accumulators are further proposed for the product accumulation in the PE. Compared with the exact 8-bit fixed-point design, the proposed PE saves more than 29% and 20% in power-delay product for 3×3 and 5×5 sums of products, respectively. Also, compared with PEs consisting of state-of-the-art approximate multipliers, the proposed design shows a significantly smaller error bias with lower hardware overhead. Moreover, the application of the approximate PEs in CNN accelerators is analyzed by implementing a multi-task CNN for face detection and alignment. We conclude that 1) an approximate PE is more effective for face detection than for alignment, 2) an approximate PE with high statistically measured accuracy does not necessarily result in good quality in face detection, and 3) properly increasing the number of PEs in a CNN accelerator can improve its power and energy efficiency.
Keywords: approximate computing; convolutional neural network (CNN); sum of products (SoP); data representation; multiplier
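Note: The abstract mentions stochastic rounding without giving details. The following is a minimal sketch of the generic technique only, rounding a real value to a neighbouring integer with probability proportional to proximity so that the result is unbiased in expectation. It is not the paper's weight data format, which this record does not describe; the function names `stochastic_round` and `mean_rounded` are hypothetical and chosen for illustration.

```python
import math
import random


def stochastic_round(x: float, rng: random.Random) -> int:
    """Round x to floor(x) or floor(x) + 1 at random.

    The probability of rounding up equals the fractional part of x,
    so the expected value of the result equals x (generic technique,
    not the specific data format used in the paper).
    """
    lower = math.floor(x)
    frac = x - lower  # in [0, 1)
    return lower + (1 if rng.random() < frac else 0)


def mean_rounded(x: float, trials: int, seed: int = 0) -> float:
    """Average many stochastic roundings of x to show the low bias."""
    rng = random.Random(seed)
    return sum(stochastic_round(x, rng) for _ in range(trials)) / trials
```

Averaging many roundings of the same value, for example `mean_rounded(2.3, 20000)`, should land close to 2.3, whereas round-to-nearest would always return 2.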