Authors: YING Yangwei; ZHANG Luoming; QI Wei; ZHENG Kai; ZHOU Hong[1]
Affiliation: [1] College of Biomedical Engineering & Instrument Science, Zhejiang University, Hangzhou 310027, China
Source: Experimental Technology and Management, 2024, No. 5, pp. 15-22 (8 pages)
Funding: National Key R&D Program of China (2022YFC3602601); Ministry of Education Industry-University Collaborative Education Program (220600656141412).
Abstract: Deep learning is widely applied in scientific research, teaching, industrial production, and other fields, but its large data volumes and complex model structures make the training stage heavily dependent on computing resources. To improve resource utilization in experimental teaching and to help students become more proficient at data collection and at tuning and optimizing model parameters, a training acceleration method based on weight reuse is proposed. The method scales the depth and width of the VGG and ResNet architectures, allowing a model to reuse the weights of a network whose structure is similar but not completely identical. Experiments on the CIFAR10 dataset show that training initialized by weight reuse converges faster and, at the end of training, reaches accuracy close to that of training from random initialization, thereby accelerating the training of the expanded networks. The approach is a more flexible form of knowledge transfer and helps students develop the ability to understand and optimize complex models.

[Objective] Deep learning has been widely applied in various fields, including scientific research, teaching, and industrial production. However, due to the large amount of data and complex model structures involved, it relies on large amounts of computing resources during the model training stage. Knowledge transfer methods that reuse the weights of pretrained models have been widely used in computer vision and natural language processing. For example, when training a detection network on the VOC or COCO dataset, a classification network pretrained on the ImageNet dataset is used as the backbone network for further training. On the one hand, reusing weights trained on similar datasets helps improve the performance of the target task; on the other hand, it can also accelerate the training process. To improve resource utilization efficiency in experimental teaching and to allow students to become more proficient in data collection and in model parameter adjustment and optimization, a weight reuse-based training acceleration method is proposed. [Methods] Common weight reuse methods often require a high degree of structural consistency between the pretrained network and the target network, which limits how the network can be expanded when exploring a suitable structure. In this paper, a more flexible method of knowledge transfer is proposed, which allows a network to reuse the weights of another network whose structure is similar but not completely consistent. Training an expanded network with this method is much faster than training from scratch. The algorithm expands the depth and width of the VGG and ResNet network structures, respectively, allowing models to reuse network weights that are similar in structure but not completely consistent. The network exploration scheme of the proposed weight reuse method differs from that of knowledge distillation-based schemes: it directly transforms the weights of the previously explored network to initialize the new network rather than training it from scratch. Due to the lack of gui
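The paper does not give its exact weight transformation, but the core idea of the abstract — initializing a wider or deeper network from the weights of a structurally similar smaller one — can be sketched as follows. This is a minimal, hypothetical illustration using NumPy: the overlapping region of each weight tensor is copied from the pretrained model, and the remaining entries are randomly initialized; `reuse_weights` and its parameters are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def reuse_weights(pretrained, target_shape, rng=None):
    """Initialize a larger weight tensor from a smaller pretrained one.

    The region where the two shapes overlap is copied verbatim from the
    pretrained tensor; entries outside that region get small random values.
    This sketches weight reuse between similar but non-identical structures.
    """
    rng = rng or np.random.default_rng(0)
    out = rng.normal(0.0, 0.01, size=target_shape)
    # Slice out the overlapping block along every axis.
    overlap = tuple(slice(0, min(p, t))
                    for p, t in zip(pretrained.shape, target_shape))
    out[overlap] = pretrained[overlap]
    return out

# Widen a 4x3 dense layer to 6x5: the top-left 4x3 block is reused,
# and the new rows/columns are freshly initialized.
w_small = np.arange(12.0).reshape(4, 3)
w_big = reuse_weights(w_small, (6, 5))
```

In a full training pipeline, the same copy-the-overlap rule would be applied per layer (e.g., per convolution kernel along the channel axes), after which the expanded network is fine-tuned instead of trained from random initialization.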
CLC number: TP391 [Automation and Computer Technology - Computer Application Technology]