Neural network quantization method based on round-error

Authors: GUO Qiu-dan; PU Yue-gang[1]; ZHANG Qi-jun; DING Chuan-hong[1]; WU Dong (Institute 706, Second Academy of China Aerospace Science and Industry Corporation, Beijing 100854, China)

Affiliation: [1] Institute 706, Second Academy of China Aerospace Science and Industry Corporation, Beijing 100854, China

Source: Computer Engineering and Design (《计算机工程与设计》), 2024, No. 8, pp. 2534-2539 (6 pages)

Abstract: Deep neural networks incur high computational costs. Reducing the power consumption and latency of neural network inference is key to deploying neural networks on edge devices with strict power and compute constraints. To address this, an end-to-end post-training quantization method based on rounding error is proposed to mitigate the accuracy degradation that occurs when neural networks are quantized to low bit widths. The method requires only a small batch of unlabeled data for training and performs well across different neural network architectures: with both weights and activations quantized to 4 bits, the classification accuracy of RegNetX-3.2GF drops by less than 2%.
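
The abstract does not say how the proposed method uses rounding error, so the following is only a generic illustration of the term, not the authors' algorithm. This minimal Python sketch (assuming NumPy) quantizes a weight tensor to 4 bits with symmetric per-tensor uniform quantization and measures the resulting rounding error. The function name `quantize_uniform`, the max-abs scaling rule, and the random weights are all assumptions introduced for illustration.

```python
import numpy as np

def quantize_uniform(w, num_bits=4):
    """Symmetric per-tensor uniform quantization (illustrative assumption,
    not the scheme from the paper). Returns dequantized weights and the scale."""
    qmax = 2 ** (num_bits - 1) - 1       # largest signed level, 7 for 4 bits
    qmin = -(2 ** (num_bits - 1))        # smallest signed level, -8 for 4 bits
    max_abs = float(np.max(np.abs(w)))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer level, then clip to the representable range.
    q = np.clip(np.round(w / scale), qmin, qmax)
    return q * scale, scale

# Hypothetical weights standing in for one layer of a network.
rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64))

w_hat, scale = quantize_uniform(w, num_bits=4)
rounding_error = w - w_hat               # element-wise error introduced by rounding
max_err = float(np.max(np.abs(rounding_error)))
mse = float(np.mean(rounding_error ** 2))
```

Because the scale here is chosen from the largest absolute weight, no value is clipped, and each element's rounding error is at most half a quantization step (`scale / 2`). Post-training quantization methods in general try to reduce the effect of this error on the network output using a small calibration set rather than full retraining.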

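Keywords: model compression; network distillation; network quantization; object recognition; quantization-aware training; post-training quantization; rounding error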

Classification number: TP389.1 [Automation and Computer Technology: Computer System Architecture]

 
