Authors: 李永博 (LI Yong-bo), 王琴 (WANG Qin) [1], 蒋剑飞 (JIANG Jian-fei) [1]
Affiliation: [1] School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
Source: 《微电子学与计算机》 (Microelectronics & Computer), 2020, No. 6, pp. 30-34, 39 (6 pages)
Funding: National Natural Science Foundation of China (61176037).
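Abstract (translated from the Chinese): To reduce the latency and energy consumption of convolutional neural network inference, a dynamic network pruning technique is used to obtain a sparse network, and a sparse convolutional neural network accelerator with high energy efficiency is designed. To address unbalanced computing load, a dataflow suited to sparse computation is proposed. To address the high latency of convolution, a 16×16 computing array is used to raise parallelism, index units are designed to avoid invalid operations, a systolic input layer is designed to strengthen data reuse, and ping-pong buffers are used to reduce data waiting. Synthesis results show that, in a TSMC 28nm process, the chip can operate at 500MHz with a power consumption of 249.7mW, a peak convolution performance of 256GOPS, and an energy efficiency of 1.03TOPS/W.
Abstract (English, as published): In order to reduce the latency and energy consumption of convolutional neural networks, dynamic network surgery is used to get sparse networks, and a sparse convolutional neural network accelerator with high energy efficiency is designed. Aiming at the problem of unbalanced computing load, a dataflow suitable for sparse computing is proposed. To reduce the latency of the convolution operation, a 16×16 process engine array is used to improve computation parallelism, index units are designed to avoid invalid operations, a systolic input structure is designed to enhance data reuse, and ping-pong buffers are introduced to reduce data waiting. The synthesis results show that the frequency can reach 500MHz, the power consumption is 139mW, the peak performance is 221GOPS, and the energy efficiency is 1.59TOPS/W with the TSMC 28nm process.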
Classification code: TN492 [Electronics and Telecommunications: Microelectronics and Solid-State Electronics]
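The performance figures quoted in the abstract can be related with simple arithmetic. The sketch below is illustrative only and is not taken from the paper: it assumes each element of the 16×16 array completes one multiply-accumulate per cycle, counted as 2 operations, and the function names are hypothetical. Under that assumption, the array at 500MHz gives a peak of 256GOPS, and dividing throughput by power gives the energy efficiency.

```python
def peak_gops(rows, cols, freq_mhz, ops_per_pe_per_cycle=2):
    """Peak throughput in GOPS for a rows x cols array of processing elements.

    Assumption (not stated in the record): one multiply-accumulate per
    element per cycle, counted as 2 operations.
    """
    return rows * cols * ops_per_pe_per_cycle * freq_mhz / 1000.0


def efficiency_tops_per_w(gops, power_mw):
    """Energy efficiency in TOPS/W; GOPS per mW is numerically the same unit."""
    return gops / power_mw


if __name__ == "__main__":
    peak = peak_gops(16, 16, 500)
    print(f"peak: {peak:.0f} GOPS")
    # Figures from the Chinese-language abstract: 256 GOPS at 249.7 mW.
    print(f"{efficiency_tops_per_w(256, 249.7):.2f} TOPS/W")
    # Figures from the English-language abstract: 221 GOPS at 139 mW.
    print(f"{efficiency_tops_per_w(221, 139):.2f} TOPS/W")
```

Each abstract's efficiency figure matches its own power and throughput figures (1.03TOPS/W and 1.59TOPS/W respectively), although the two abstracts report different values.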