Hybrid Optimization Design of a High-Performance YOLOv3-tiny Embedded Hardware Accelerator

Authors: Tan Huisheng[1], Xiao Xinkai, Qing Xiang (College of Railway Transportation, Hunan University of Technology, Zhuzhou 412000, China)

Affiliation: [1] College of Railway Transportation, Hunan University of Technology, Zhuzhou, Hunan 412000, China

Source: Semiconductor Technology (《半导体技术》), 2025, No. 1, pp. 55-63 (9 pages)

Funding: Hunan Provincial Degree and Postgraduate Teaching Reform Research Project (2022JGYB183).

Abstract: Deploying neural networks on embedded devices is constrained by algorithm complexity, execution speed, and hardware resources. To address this, a high-performance YOLOv3-tiny hardware accelerator was designed on the Zynq heterogeneous platform. For algorithm optimization, the convolutional and batch normalization layers were fused and an 8-bit quantization algorithm was applied, which simplifies the algorithm flow. For the accelerator architecture, a dynamically configurable inter-layer pipeline and an efficient data transmission scheme were designed, which shorten inference time and reduce storage resource consumption. For forward inference, three computations were optimized: for convolution, an 8-channel parallel pipelined convolution module was designed based on a loop unrolling strategy; for pooling, a step-by-step calculation strategy was adopted to process continuous data streams efficiently; for upsampling, a 2× upsampling method based on data replication was proposed. Experimental results show a forward inference time of 232 ms, a power consumption of only 2.29 W, a system operating frequency of 200 MHz, and an actual computing performance of 23.97 GOPS.
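Two of the steps named in the abstract, convolution/batch-normalization fusion and 2× upsampling by data replication, can be illustrated with a minimal software sketch. This is not the authors' hardware implementation: the function names (`fold_batchnorm`, `upsample2x_replicate`, `giga_ops_per_inference`), the flat per-channel weight layout, and the `eps` value are assumptions made for illustration. The last helper multiplies the reported throughput by the reported latency, which is meaningful only if the 23.97 GOPS figure is an average over the whole forward pass.

```python
import math


def fold_batchnorm(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold inference-time BN statistics into conv weights and bias.

    weights: one flat list of weights per output channel (assumed layout).
    bias, gamma, beta, mean, var: one value per output channel.
    Returns (folded_weights, folded_bias) so that the folded convolution
    alone reproduces BN(conv(x)).
    """
    folded_w, folded_b = [], []
    for w_c, b_c, g, bt, m, v in zip(weights, bias, gamma, beta, mean, var):
        scale = g / math.sqrt(v + eps)          # per-channel BN scale
        folded_w.append([w * scale for w in w_c])
        folded_b.append((b_c - m) * scale + bt)
    return folded_w, folded_b


def upsample2x_replicate(fmap):
    """2x upsampling of a 2-D feature map: copy each value into a 2x2 block."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # duplicate along width
        out.append(wide)
        out.append(list(wide))                     # duplicate along height
    return out


def giga_ops_per_inference(gops, latency_s):
    """Operations per forward pass (in giga-ops), assuming gops is an average."""
    return gops * latency_s
```

With the figures reported in the abstract, `giga_ops_per_inference(23.97, 0.232)` gives about 5.56 giga-operations per forward pass. Folding removes the separate batch-normalization stage at inference time, and replication-based upsampling needs only copies, no arithmetic.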

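Keywords: YOLOv3-tiny network; heterogeneous platform; hardware accelerator; dynamically configurable architecture; hybrid hardware optimization; data-replication upsampling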

Classification No.: TN79 [Electronics and Telecommunications: Circuits and Systems]

 
