Authors: LI Bin (李斌)[1]; NIU Dong (钮东); WU Zhaohui (吴朝晖)[1]; XU Hui (徐会); HOU Jianda (侯健达) (School of Microelectronics, South China University of Technology, Guangzhou 510640, Guangdong, China; Zhuhai Jieli Technology Co., Ltd., Zhuhai 519060, Guangdong, China)
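Affiliations: [1] School of Microelectronics, South China University of Technology, Guangzhou 510640, Guangdong, China; [2] Zhuhai Jieli Technology Co., Ltd., Zhuhai 519060, Guangdong, China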
Source: Microelectronics & Computer (《微电子学与计算机》), 2023, No. 2, pp. 87-93 (7 pages)
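Funding: Zhuhai Industry-University-Research Cooperation Project: Research and Development of a Neural-Network-Based Artificial Intelligence Recognition System (Project No. ZH22017001200154PWC).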
Abstract: To accommodate the limited computing resources of mobile devices, this paper adopts a U-shaped network as the backbone for image denoising and proposes a new real-image denoising algorithm, CBDNet+. Building on CBDNet, it uses wavelet transforms in the down-sampling and up-sampling stages, which reduces multiplier usage, makes the algorithm easier to implement on resource-constrained mobile devices, and improves denoising performance somewhat over CBDNet. To meet the resource and low-power constraints, the trained network is pruned and quantized to 8 bits, which improves the algorithm's efficiency and reduces its storage requirements. On this basis, a dedicated neural network accelerator for mobile devices is studied and designed with attention to hardware architecture, on-chip caching, performance, and power consumption. For the convolutional neural network denoising algorithm with wavelet and inverse wavelet transforms, a dedicated CNN accelerator structure reduces on-chip and off-chip memory bandwidth; parallel computation improves the efficiency of the inverse wavelet transform; and inference is accelerated while balancing resources and speed. A real-image denoising system is implemented on the AX7350 ZYNQ platform. Results show that at a 100 MHz clock the system achieves an average computing performance of 55.2 GOPS with a power consumption of 1.93 W. On the DND test set, the system reaches a signal-to-noise ratio of 36.21 dB and a structural similarity of 0.9435.
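The abstract's claim that wavelet-based down-sampling and up-sampling reduce multiplier usage can be illustrated with a minimal sketch. It assumes an unnormalized 2D Haar wavelet on integer pixels; the abstract does not say which wavelet the paper uses, and the function names `haar_dwt2` and `haar_idwt2` are hypothetical. Under that assumption the forward transform needs only additions and subtractions, and the inverse needs only additions, subtractions, and a 2-bit right shift.

```python
def haar_dwt2(img):
    """Down-sample an H x W integer image (H, W even) into four
    H/2 x W/2 sub-bands using only additions and subtractions."""
    h, w = len(img), len(img[0])
    ll, lh, hl, hh = ([[0] * (w // 2) for _ in range(h // 2)] for _ in range(4))
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            r, s = i // 2, j // 2
            ll[r][s] = a + b + c + d   # local sum (low-pass)
            lh[r][s] = a - b + c - d   # difference between columns
            hl[r][s] = a + b - c - d   # difference between rows
            hh[r][s] = a - b - c + d   # diagonal difference
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Up-sample four sub-bands back to the full image. Each combination
    below equals exactly 4x a pixel, so '>> 2' recovers it losslessly."""
    h, w = len(ll), len(ll[0])
    out = [[0] * (2 * w) for _ in range(2 * h)]
    for r in range(h):
        for s in range(w):
            p, q, u, v = ll[r][s], lh[r][s], hl[r][s], hh[r][s]
            out[2 * r][2 * s] = (p + q + u + v) >> 2
            out[2 * r][2 * s + 1] = (p - q + u - v) >> 2
            out[2 * r + 1][2 * s] = (p + q - u - v) >> 2
            out[2 * r + 1][2 * s + 1] = (p - q - u + v) >> 2
    return out


img = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12],
       [13, 14, 15, 16]]
bands = haar_dwt2(img)
restored = haar_idwt2(*bands)
```

In this sketch the four output pixels of each 2x2 block in the inverse transform are independent of one another, which is one way the parallel computation mentioned in the abstract could be organized; the paper's actual hardware mapping may differ.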
Classification: TN492 [Electronics and Telecommunications: Microelectronics and Solid-State Electronics]