Evaluation and Mitigation of SEU in Convolutional Neural Network Accelerator


Authors: CHEN Kai; CHEN Xin[1]; ZHANG Ying[1]; ZHANG Zhiwei (College of Electronic and Information Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, Jiangsu 210016, China)

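Affiliation: [1] College of Electronic and Information Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, Jiangsu 210016, China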

Source: Chinese Journal of Electron Devices (《电子器件》), 2023, No. 2, pp. 386-390 (5 pages)

Abstract: AI accelerators used in space exploration must account for soft errors induced by SEE in the space radiation environment, so their SEE fault tolerance and reliability need to be evaluated during design. This paper performs SEU fault injection on a Lenet-5 accelerator and proposes a statistical evaluation method based on the mapping between the network structure and the circuit modules. The experimental results show that, because the AI accelerator processes a large volume of data, SEU errors in the weights and feature maps may be masked by the pooling layers as they propagate, and that SEU errors in layers close to the output are more likely to reduce recognition accuracy than those in layers close to the input. At the circuit-module level, the control unit CTRL, which generates the enable and address control signals, is more susceptible to SEU errors than the processing element PE and the memory unit MEM; in severe cases such errors disrupt the normal operation of the accelerator. Finally, based on the evaluation results, CTRL is hardened with STMR, which greatly reduces the area overhead compared with FTMR.

Keywords: CNN accelerator; Lenet-5; single event effect; fault injection

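Classification: TP302.8 [Automation and Computer Technology: Computer System Architecture]; TP391.9 [Automation and Computer Technology: Computer Science and Technology]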

 
