Authors: WU Zhiyuan; QI Hong [1,3]; JIANG Yu [1,3]; CUI Chupeng; YANG Zongmin; XUE Xinhui
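Affiliations: [1] College of Computer Science and Technology, Jilin University, Changchun 130012, China; [2] Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; [3] Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China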
Source: Journal of Jilin University: Science Edition, 2022, No. 4, pp. 881-888 (8 pages)
Funding: National Natural Science Foundation of China (Grant Nos. U20A20285, 62072211, 51939003).
Abstract: To address the limited computational and storage resources of embedded and mobile devices, and the tendency of compact network optimization to converge to poor local optima, we propose an activation map adaptation model for knowledge distillation, consisting of an activation map adapter and an activation map adaptation knowledge distillation strategy. First, the activation map adapter stacks heterogeneous convolutions and visual feature expression modules to achieve activation map size matching, synchronous transformation of teacher and student network features, and adaptive semantic information matching. Second, the distillation strategy embeds the adapter into the teacher network to reconstruct it and, during training, adaptively searches for supervision features suited to the hidden layers of the student network; the output of the front part of the adapter is used as a hint for training the front part of the student network, which transfers knowledge from the teacher to the student network, followed by further optimization under a learning-rate constraint. Finally, experiments on the image classification dataset CIFAR-10 show that the proposed model improves classification accuracy by 0.6%, reduces inference loss by 6.5%, and reduces the time to converge to 78.2% accuracy to 5.6% of the time required without transfer.
Keywords: artificial intelligence; knowledge distillation; activation map adaptation; model transfer; image classification
Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]