Authors: WEI Xiao-hui[1], WANG Chen-yang, WU Qi[1], ZHENG Xin-yang, YU Hong-mei[1], YUE Heng-shan (College of Computer Science & Technology, Jilin University, Changchun 130012, China)
Affiliation: [1] College of Computer Science and Technology, Jilin University, Changchun 130012, China
Source: Journal of Jilin University: Engineering and Technology Edition (《吉林大学学报(工学版)》), 2024, No. 6, pp. 1746-1755 (10 pages)
Funding: National Natural Science Foundation of China (62272190, U19A2061).
Abstract: Based on the inherent error resilience of neural networks and the similarity of filters within a layer, this paper proposes an approximate fault-tolerant design. Filters are partitioned into check groups that are verified inexactly, while severe errors are still guaranteed to be detected and recovered. By optimizing the filter-to-processing-element mapping so that the checking flow matches the dataflow of the systolic array, the proposed design reduces performance overhead by 73.39% compared with traditional dual modular redundancy (DMR).
Extended abstract: To satisfy the massive computational requirements of Convolutional Neural Networks (CNNs), various accelerators based on Domain-Specific Architectures have been deployed in large-scale systems. While such accelerators improve performance significantly, their high integration makes them much more susceptible to soft errors, which are propagated and amplified layer by layer during CNN execution, finally disturbing the CNN's decision and leading to catastrophic consequences. CNNs are increasingly deployed in security-critical areas, so reliable execution requires more attention. Although classical fault-tolerant approaches are effective against errors, the performance and energy overheads they introduce are non-negligible, which runs counter to the design philosophy of CNN accelerators. In this article, we leverage the CNN's intrinsic tolerance of minor errors and the similarity of filters within a layer to explore Approximate Fault Tolerance as a way to reduce the fault tolerance overhead of CNN accelerators. By clustering the filters into several check groups and performing an inexact check while ensuring that serious errors are mitigated, our approximate fault tolerance design reduces fault tolerance overhead significantly. Furthermore, we remap the filters so that the checking process matches the dataflow of the systolic array, which satisfies the real-time checking demands of CNNs. Experimental results show that our approach reduces performance overhead by 73.39% relative to the baseline DMR.
Keywords: computer system architecture; convolutional neural network; systolic array; soft error; approximate fault tolerance
Classification No.: TP302.8 [Automation and Computer Technology - Computer System Architecture]