PRAP-PIM: A weight pattern reusing aware pruning method for ReRAM-based PIM DNN accelerators


Authors: Zhaoyan Shen, Jinhao Wu, Xikun Jiang, Yuhao Zhang, Lei Ju, Zhiping Jia

Affiliations: [1] School of Computer Science and Technology, Shandong University, Qingdao 266237, China; [2] School of Cyber Science and Technology, Shandong University, Qingdao 266237, China; [3] Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China

Published in: High-Confidence Computing, 2023, Issue 2, pp. 50-59 (10 pages)

Funding: Partially supported by the National Natural Science Foundation of China (92064008); the CCF-Huawei Huyanglin Project (CCF-HuaweiST2021002); the Open Project Program of Wuhan National Laboratory for Optoelectronics (2022WNLOKF018); and the Shandong Provincial Natural Science Foundation (ZR2022LZH010).

Abstract: Resistive Random-Access Memory (ReRAM) based Processing-in-Memory (PIM) frameworks have been proposed to accelerate DNN models by eliminating data movement between the computing and memory units. To further reduce area and energy consumption, DNN weight sparsity and weight pattern repetition are exploited to optimize these ReRAM-based accelerators. However, most existing works focus on only one aspect of this software/hardware co-design framework and optimize it in isolation, which leaves the design far from optimal. In this paper, we propose PRAP-PIM, which jointly exploits weight sparsity and weight pattern repetition by using a weight pattern reusing aware pruning method. By relaxing the precondition for weight pattern reusing, we propose a similarity-based weight pattern reusing method that achieves a higher weight pattern reusing ratio. Experimental results show that PRAP-PIM achieves a 1.64× performance improvement and a 1.51× energy efficiency improvement on popular deep learning benchmarks, compared with state-of-the-art ReRAM-based DNN accelerators.
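The abstract only sketches the idea of similarity-based weight pattern reusing; the paper's actual algorithm is not reproduced here. As a rough illustration of what such a step might look like, the following minimal Python sketch greedily maps pruned weight blocks onto a small set of reusable patterns whenever their distance falls below a threshold. The function name, the relative-L2 similarity measure, and the threshold value are all assumptions for illustration, not the authors' method.

```python
import numpy as np

# Hypothetical sketch: map weight blocks onto reusable patterns by similarity.
# A block whose relative distance to an existing pattern is below `threshold`
# reuses that pattern; otherwise the block is registered as a new pattern.
def similarity_based_pattern_reuse(weight_blocks, threshold=0.5):
    patterns = []      # representative weight patterns (one crossbar mapping each)
    assignments = []   # index of the pattern each block is mapped to
    for block in weight_blocks:
        best_idx, best_dist = -1, float("inf")
        for idx, pattern in enumerate(patterns):
            # Relative L2 (Frobenius) distance as an assumed similarity measure.
            dist = np.linalg.norm(block - pattern) / (np.linalg.norm(pattern) + 1e-12)
            if dist < best_dist:
                best_idx, best_dist = idx, dist
        if best_dist < threshold:
            assignments.append(best_idx)            # reuse an existing pattern
        else:
            patterns.append(block.copy())           # register a new pattern
            assignments.append(len(patterns) - 1)
    return patterns, assignments

# Toy usage: 8 random 2x2 weight blocks standing in for pruned filter slices.
blocks = [np.random.randn(2, 2) for _ in range(8)]
patterns, assignments = similarity_based_pattern_reuse(blocks)
print(f"{len(blocks)} blocks mapped to {len(patterns)} patterns: {assignments}")
```

In this reading, a higher reuse ratio means fewer distinct patterns need to be programmed into ReRAM crossbars, which is where the reported area and energy savings would come from.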

Keywords: Resistive Random-Access Memory; Processing-in-Memory; Deep neural network; Model compression

Classification: TP391.41 (Automation and Computer Technology / Computer Application Technology)

 
