Authors: Lyu Zhefan (吕浙帆), Wang Tiancheng (王天成) [1], Li Huawei (李华伟) [1,2]
Affiliations: [1] State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; [2] School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049
Source: Journal of Computer-Aided Design & Computer Graphics (《计算机辅助设计与图形学学报》), 2023, No. 11, pp. 1789-1801 (13 pages)
Funding: National Key Research and Development Program of China (2020YFB1600201); National Natural Science Foundation of China (62090024).
Abstract: Compared with the dual-core lockstep technique commonly used in industry, heterogeneous parallel error detection achieves similar error coverage with a smaller area overhead, but at the cost of higher error detection latency and some performance degradation of the main core. To address the potential safety risks caused by errors that are not detected in time, a low-latency heterogeneous parallel error detection method is proposed. First, the impact of register copying on the main core's performance is reduced by stalling the release of physical registers while the registers are being copied. Second, to improve the performance of the checker cores, the main core's control flow is used to guide instruction fetch in the checker cores, and program segments are divided based on their predicted running time on the checker cores, so that the maximum error detection latency is bounded. The method was implemented using one open-source XiangShan processor core as the main core and 16 open-source Rocket processors as the checker cores. Experimental results on benchmark programs show that the proposed method achieves error detection with 50% logic area overhead and 22% storage area overhead, which is less than the nearly 100% area overhead of dual-core lockstep. The average performance overhead on the main core is less than 1%, and the error detection latency is kept within 2000 clock cycles. Moreover, the average performance of the checker cores improves by 14.9% compared with the original branch prediction strategy.
Classification: TP391.41 [Automation and Computer Technology: Computer Application Technology]