Approximate Fault Tolerance Design for Soft Errors in Systolic Array Neural Network Accelerators

Systolic array-based CNN accelerator soft error approximate fault tolerance design


Authors: WEI Xiao-hui[1]; WANG Chen-yang; WU Qi[1]; ZHENG Xin-yang; YU Hong-mei[1]; YUE Heng-shan (College of Computer Science & Technology, Jilin University, Changchun 130012, China)

Affiliation: [1] College of Computer Science and Technology, Jilin University, Changchun 130012, China

Source: Journal of Jilin University: Engineering and Technology Edition, 2024, No. 6, pp. 1746-1755 (10 pages)

Funding: National Natural Science Foundation of China (62272190, U19A2061).

Abstract: Building on the inherent error resilience of neural networks and the similarity among filters within a layer, this paper proposes an approximate fault tolerance design. Filters are partitioned into check groups that are verified inexactly, while severe errors are still guaranteed to be detected and recovered. By optimizing the filter-to-processing-element mapping so that the checking flow matches the systolic array dataflow, the proposed design reduces performance overhead by 73.39% compared with traditional dual modular redundancy (DMR).

To satisfy the massive computational requirements of Convolutional Neural Networks (CNNs), various accelerators based on Domain-Specific Architectures have been deployed in large-scale systems. While such accelerators improve performance significantly, their high integration density makes them much more susceptible to soft errors, which are propagated and amplified layer by layer during CNN execution, eventually disturbing the CNN's decision and leading to catastrophic consequences. CNNs are increasingly deployed in security-critical areas, which demands more attention to reliable execution. Although classical fault-tolerant approaches are effective against errors, the performance and energy overheads they introduce are non-negligible, which runs counter to the design philosophy of CNN accelerators. In this article, we leverage the CNN's intrinsic tolerance of minor errors and the similarity of filters within a layer to explore Approximate Fault Tolerance opportunities for reducing the fault tolerance overhead of CNN accelerators. By clustering the filters into several check groups and performing an inexact check, while ensuring that serious errors are mitigated, our approximate fault tolerance design reduces fault tolerance overhead significantly. Furthermore, we remap the filters so that the checking process matches the dataflow of the systolic array, which satisfies the real-time checking demands of CNNs. Experimental results show that our approach reduces the performance degradation of baseline DMR by 73.39%.
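The abstract does not spell out how the check groups or the inexact check are built, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes filters are flattened weight vectors, groups them with a plain k-means (an assumed clustering choice), and uses each group's centroid as a shared check filter. By the Cauchy-Schwarz inequality, a fault-free output `y_i = w_i · x` can differ from the centroid output `c_g · x` by at most `||w_i - c_g|| * ||x||`, so any larger deviation must be an error, while small deviations are tolerated. All function names (`group_filters`, `inexact_check`) and the tolerance `eps` are hypothetical.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def group_filters(filters, k, iters=10):
    """Plain k-means over flattened filters (assumed clustering choice).

    Seeds are simply the first k filters. Returns (assign, centroids).
    """
    centroids = [list(f) for f in filters[:k]]
    assign = [0] * len(filters)
    for _ in range(iters):
        # Assign each filter to its nearest centroid.
        assign = [min(range(k), key=lambda g: dist(f, centroids[g]))
                  for f in filters]
        # Move each centroid to the mean of its members.
        for g in range(k):
            members = [f for f, a in zip(filters, assign) if a == g]
            if members:
                centroids[g] = [sum(col) / len(members)
                                for col in zip(*members)]
    return assign, centroids


def inexact_check(filters, assign, centroids, x, outputs, eps=1e-9):
    """Return indices of outputs that violate the Cauchy-Schwarz bound.

    Only one redundant dot product is computed per group, rather than
    one per filter as full duplication would require.
    """
    x_norm = math.sqrt(dot(x, x))
    refs = [dot(c, x) for c in centroids]
    flagged = []
    for i, (w, y) in enumerate(zip(filters, outputs)):
        g = assign[i]
        bound = dist(w, centroids[g]) * x_norm + eps
        if abs(y - refs[g]) > bound:
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    # Toy data: two pairs of similar filters and one input patch.
    filters = [[1.0, 0.0, 1.0], [-1.0, 2.0, 0.0],
               [1.1, 0.0, 0.9], [-0.9, 2.1, 0.1]]
    x = [0.5, -1.0, 2.0]
    assign, centroids = group_filters(filters, k=2)
    outputs = [dot(w, x) for w in filters]
    outputs[2] += 64.0  # emulate a large soft error in one output
    print("flagged outputs:", inexact_check(filters, assign, centroids, x, outputs))
```

In this sketch a flagged output would simply be recomputed to recover. The tighter the filters in a group cluster around their centroid, the smaller the bound and the smaller the errors that slip through undetected, which is the trade-off an approximate check makes against exact duplication.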

Keywords: computer system architecture; convolutional neural network; systolic array; soft error; approximate fault tolerance

Classification code: TP302.8 [Automation and Computer Technology: Computer System Architecture]

 
