Authors: XIE Xiao-yan (谢晓燕); DU Zhuo-lin (杜卓林); HU Chuan-zhan (胡传瞻); YANG Kun (杨坤); WANG An-qi (王安琪) (School of Computer, Xi'an University of Posts and Telecommunications, Xi'an 710121, China; School of Electronic Engineering, Xi'an University of Posts and Telecommunications, Xi'an 710121, China)
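Affiliations: [1] School of Computer, Xi'an University of Posts and Telecommunications, Xi'an, Shaanxi 710121; [2] School of Electronic Engineering, Xi'an University of Posts and Telecommunications, Xi'an, Shaanxi 710121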
Source: Computer Engineering and Design (《计算机工程与设计》), 2022, No. 4, pp. 1195-1200, F0003 (7 pages in total)
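Funding: National Natural Science Foundation of China (61834005, 61772417, 61802304, 61602377, 61634004); Shaanxi Province International Science and Technology Cooperation Program (2018KW-006); Xi'an Science and Technology Plan (2019218114GXRC017CG018-GXYD17.9, GXYD17.11).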
Abstract: The large number of multiply-accumulate operations in convolutional neural networks (CNNs) brings a huge number of parameters and computations, which causes serious memory-access and power-consumption problems in hardware acceleration. A parallel reconfigurable computing scheme is proposed for 28×28 and 32×32 images that simultaneously supports 1×1, 3×3, and 5×5 convolution kernels on a 4×4 processing-element array, reducing the on-chip resource usage of the Inception network. The input image is preprocessed, and a data organization scheme with overlapping windows is proposed, which reduces the number of pixels loaded from external memory by 30%. Experimental results show that, at an operating frequency of 123 MHz, the hardware memory-access overhead of the preprocessed scheme falls to 45% of that of the scheme without preprocessing, the data reuse rate of the convolution computation reaches 66.7%, the operating power consumption is 6.395 W, and the reported power-per-watt figure is 0.176. Performance is significantly improved compared with the FPGA version.
Keywords: Inception network; array processor; reconfiguration; overlapping windows; data organization
Classification code: TP183 [Automation and Computer Technology: Control Theory and Control Engineering]
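
The abstract's overlapping-window idea can be illustrated with a little arithmetic. The sketch below is not the authors' method. It only computes how many pixels two horizontally adjacent convolution windows share, assuming a stride of 1 (the record does not state the stride). The function names are hypothetical. Under that assumption a 3×3 kernel shares 6 of its 9 pixels with its neighbour, which is numerically close to the 66.7% reuse rate quoted in the abstract. Whether the paper derives its figure this way is an assumption.

```python
def window_overlap_fraction(kernel: int, stride: int = 1) -> float:
    """Fraction of a kernel x kernel window's pixels that also lie in the
    next window, shifted `stride` columns to the right (closed form)."""
    shared_cols = max(kernel - stride, 0)      # columns common to both windows
    return (shared_cols * kernel) / (kernel * kernel)


def window_overlap_bruteforce(kernel: int, stride: int = 1) -> float:
    """Same quantity, obtained by enumerating pixel coordinates explicitly."""
    first = {(r, c) for r in range(kernel) for c in range(kernel)}
    second = {(r, c + stride) for r in range(kernel) for c in range(kernel)}
    return len(first & second) / len(first)


if __name__ == "__main__":
    # Kernel sizes named in the abstract; stride 1 is an assumption.
    for k in (1, 3, 5):
        print(f"{k}x{k}: {window_overlap_fraction(k):.3f}")
```

A loader that keeps the shared columns on chip only needs to fetch the new columns for each successive window, which is the general motivation for organizing data around overlapping windows.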