Authors: CHEN Weiwei (陈维伟), WANG Ying (王颖) [1], ZHANG Lei (张磊) [1]
Affiliations: [1] Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; [2] University of Chinese Academy of Sciences, Beijing 100049
Source: Chinese High Technology Letters (《高技术通讯》), 2022, No. 11, pp. 1143-1152 (10 pages)
Funding: Supported by the National Natural Science Foundation of China (61902375) and the Strategic Priority Research Program of the Chinese Academy of Sciences, Category C (XDC05030201).
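Abstract (translated from the Chinese): Neural network algorithms and deep learning accelerators have become the two most important forces driving the adoption of deep learning. However, current neural network architecture design focuses mainly on metrics such as model accuracy and computation cost, and ignores differences in how efficiently different models run on the target accelerator. Accelerator design, in turn, is generally optimized for an established set of neural network benchmarks and often fails to cover the models that continue to evolve, so accelerators tend to perform poorly on new network architectures. In essence, the relatively independent design flows of neural network architectures and accelerators lead to a mismatch between the design and optimization of the two, so optimal deep learning inference performance cannot be reached. To address this, the paper proposes a software-hardware co-design framework for network architectures and accelerators, targeting image classification tasks. The framework merges network architecture and accelerator design into a unified design space and, given the design constraints, automatically searches for the optimal co-design solution, achieving end-to-end customization and optimization of deep learning inference. Experiments show that, on a real image classification dataset and a systolic array architecture, the proposed co-design method reduces energy consumption by 40% on average compared with the traditional approach of optimizing the network architecture and the accelerator independently.
Abstract (published English version): Neural network architectures and hardware accelerators have been two driving forces for the rapid progress in deep learning. However, previous work has optimized either neural architectures given fixed hardware, or hardware given fixed neural architectures. The design of neural network architectures focuses on accuracy and does not take the characteristics of the accelerator hardware into consideration. Accelerator design is generally aimed at specific benchmarks and does not support new network structures, which makes hardware design lag behind algorithm updates. At the same time, deep learning has a variety of application scenarios, and different scenarios have different software and hardware requirements. Special scenarios therefore require dedicated software and hardware design, which demands considerable labor and expert knowledge. This paper studies the importance of co-designing neural architectures and hardware accelerators. To this end, an automatic framework that jointly searches for the best configuration of both the neural network architecture and the accelerator is proposed. The framework combines the network architecture and accelerator design spaces, then automatically searches for a co-design solution under the given design constraints, thus offering better performance opportunities than previous approaches that design the network and the accelerator separately. The experiments show that, compared with previous methods, joint optimization reduces average energy consumption by 40% in a real image classification task under the same level of accuracy constraints.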
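Keywords: neural network architecture design; accelerator design; software-hardware co-design; design space exploration
Classification: TP183 [Automation and Computer Technology: Control Theory and Control Engineering]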
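Note: the abstract describes merging the network architecture and accelerator design spaces and searching the combined space under design constraints. The record gives no details of the search algorithm or the cost models, so the following is only a minimal sketch of that general idea, not the authors' method. Everything in it is an assumption made for illustration: the candidate lists `NETWORKS` and `ARRAYS`, the proxy functions `toy_accuracy` and `toy_energy`, and the use of exhaustive enumeration in place of whatever search strategy the paper uses.

```python
from itertools import product
from math import ceil

# Hypothetical design spaces (not from the paper).
# A network is (depth, width); an accelerator is a systolic array of (rows, cols).
NETWORKS = [(depth, width) for depth in (8, 14, 20) for width in (16, 32, 64)]
ARRAYS = [(rows, cols) for rows in (8, 16, 32) for cols in (8, 16, 32)]


def toy_accuracy(depth, width):
    """Made-up accuracy proxy: rises with depth and width, saturating below 1."""
    return 1.0 - 1.0 / (1.0 + 0.02 * depth * width ** 0.5)


def toy_energy(depth, width, rows, cols):
    """Made-up energy proxy: compute energy plus array-size-dependent overhead."""
    macs = depth * width * width
    # Channels are tiled onto the array; partially filled tiles waste energy.
    tiles = ceil(width / rows) * ceil(width / cols)
    cycles = depth * tiles * width
    dynamic = 1.0 * macs
    static = 0.05 * cycles * rows * cols
    return dynamic + static


def co_search(min_accuracy):
    """Enumerate the joint space and return the lowest-energy feasible pair.

    Returns (energy, (depth, width), (rows, cols)), or None if no network
    meets the accuracy constraint.
    """
    feasible = [
        (toy_energy(d, w, r, c), (d, w), (r, c))
        for (d, w), (r, c) in product(NETWORKS, ARRAYS)
        if toy_accuracy(d, w) >= min_accuracy
    ]
    return min(feasible) if feasible else None


if __name__ == "__main__":
    print(co_search(0.6))
```

Because each candidate is scored as a (network, accelerator) pair, the search can prefer a network whose layer widths tile well onto a particular array, which a network-only or accelerator-only optimization would not see. A realistic design space would be far too large to enumerate and would need a guided search in place of the exhaustive loop used here.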