A Multi-core-Based Convolutional Neural Network Acceleration Method and Its System Implementation  Cited by: 2

Study on Realization of Convolutional Neural Network Acceleration Based on Multi-core


Author: ZHANG Huiming (张慧明), VeriSilicon Microelectronics (Shanghai) Co., Ltd., Shanghai 201203, China

Affiliation: [1] VeriSilicon Microelectronics (Shanghai) Co., Ltd., Shanghai 201203

Source: Application of IC (《集成电路应用》), 2020, No. 5, pp. 10-13 (4 pages)

Funding: Shanghai High-Tech Enterprise Science and Technology Innovation Project.

Abstract: Convolutional neural networks are commonly accelerated by arranging multiple convolution units in parallel. In the ideal case, the more convolution units there are, the faster the processing. In practice, however, data bandwidth severely limits the processing speed of the convolution units: hardware bandwidth is a scarce resource, and increasing it is very costly. Raising the processing speed of a convolutional neural network under limited data bandwidth and hardware overhead has therefore become an urgent problem in hardware architecture design. To address these shortcomings of the prior art, this work provides a multi-core-based convolutional neural network acceleration method and system, together with a storage medium and a terminal. Multiple parallel convolution cores reduce the data bandwidth the network requires, and, for the same hardware data bandwidth, parallel vector dot-product operations inside the convolution cores increase the processing speed of the network.
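The abstract does not describe the hardware in detail, so the following is only a software sketch of the general idea it names: several convolution cores share one fetched input window, and each core reduces that window against its own kernel with a vector dot product. The function names (`dot`, `multicore_conv`) and the `input_reads` counter are illustrative assumptions, not taken from the paper, and the counter ignores any on-chip reuse between overlapping windows.

```python
def dot(a, b):
    """Vector dot product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))


def multicore_conv(x, kernels):
    """Model K hypothetical convolution cores sharing each input window.

    x:       2-D list (H x W), a single-channel input.
    kernels: list of K 2-D lists (R x S), one kernel per core.
    Returns (out, input_reads), where out[k][i][j] is core k's output and
    input_reads counts the input elements fetched in total.
    """
    H, W = len(x), len(x[0])
    R, S = len(kernels[0]), len(kernels[0][0])
    # Flatten each core's kernel so the convolution becomes a dot product.
    flat_kernels = [[v for row in k for v in row] for k in kernels]
    out = [[[0] * (W - S + 1) for _ in range(H - R + 1)] for _ in kernels]
    input_reads = 0
    for i in range(H - R + 1):
        for j in range(W - S + 1):
            # The window is fetched once ...
            window = [x[i + r][j + s] for r in range(R) for s in range(S)]
            input_reads += len(window)
            # ... and reused by every core (sequential here, parallel in hardware).
            for k, fk in enumerate(flat_kernels):
                out[k][i][j] = dot(window, fk)
    return out, input_reads


x = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernels = [[[1, 0], [0, 1]], [[1, 1], [1, 1]]]
out, input_reads = multicore_conv(x, kernels)
# Running each kernel in its own pass would fetch every window once per kernel.
separate_pass_reads = input_reads * len(kernels)
```

In this toy model the shared-window scheme fetches 16 input elements for two kernels, whereas two independent passes would fetch 32, which illustrates why sharing input data across cores can lower the bandwidth needed per output.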

Keywords: convolutional neural network; data bandwidth; machine learning; deep learning

Classification: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]
