Divergence-Based Core Dataset Selection Algorithm (基于分歧的核心数据集筛选算法)

An Efficient Core-set Selection Algorithm Based on Difference


Authors: WANG Zongchi (王纵驰); LIU Jian (刘健); WANG Pei (王培); ZHAO Xingbo (赵兴博); YU Jiageng (于佳耕) [4]; TAO Qingchuan (陶青川) [3]

Affiliations: [1] China National Aviation Fuel Group Limited, Beijing 100088; [2] Aerospace Shenzhou Intelligent System Technology Co., Ltd., Beijing 100029; [3] College of Electronics Information and Engineering, Sichuan University, Chengdu 610065; [4] Institute of Software, Chinese Academy of Sciences, Beijing 100190

Source: Computer & Digital Engineering (《计算机与数字工程》), 2024, No. 5, pp. 1304-1309, 1316 (7 pages)

Abstract: As deep learning has developed, training datasets have grown steadily larger, making the training of deep neural networks inefficient. To address this, a divergence-based core dataset (core-set) selection algorithm is proposed, which reduces the original dataset to a core-set while preserving training effectiveness. The algorithm learns iteratively in a supervised manner, computes a divergence value for each sample through a voting network framework, and ranks the samples by that value to select them. Core-set selection experiments were carried out on the widely used CIFAR, Fashion-MNIST, and SVHN datasets. The results show that the proposed algorithm yields a core-set one fifth the size of the original dataset, while the accuracy of the model trained on it drops by no more than 5%. In addition, the generalization error of the selected core-set is only 0.13, indicating that it generalizes better.
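The abstract does not say how the divergence value is computed or which end of the ranking is kept, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes the divergence of a sample is the vote entropy over the labels predicted by several voting networks (a measure borrowed from query-by-committee active learning), and that the samples with the highest divergence are retained. The names `vote_entropy`, `select_core_set`, and `keep_fraction` are hypothetical.

```python
import numpy as np


def vote_entropy(votes: np.ndarray, num_classes: int) -> np.ndarray:
    """Per-sample entropy of the label votes (assumed divergence measure).

    votes: integer array of shape (n_voters, n_samples); votes[i, j] is the
    label that voter network i predicts for sample j.
    """
    n_voters, n_samples = votes.shape
    counts = np.zeros((n_samples, num_classes))
    for voter_labels in votes:
        # Each sample index appears once per voter, so in-place += is safe.
        counts[np.arange(n_samples), voter_labels] += 1
    p = counts / n_voters
    # Treat 0 * log(0) as 0 so unanimous votes get zero divergence.
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = np.where(p > 0, p * np.log(p), 0.0)
    return -terms.sum(axis=1)


def select_core_set(votes: np.ndarray, num_classes: int,
                    keep_fraction: float = 0.2) -> np.ndarray:
    """Return sorted indices of the samples kept in the core-set.

    Assumption: samples are ranked by divergence and the top keep_fraction
    (one fifth by default, matching the size reported in the abstract) is kept.
    """
    scores = vote_entropy(votes, num_classes)
    k = max(1, int(round(keep_fraction * scores.size)))
    order = np.argsort(-scores, kind="stable")  # highest divergence first
    return np.sort(order[:k])


# Toy example: 3 voters, 5 samples, 3 classes.
votes = np.array([
    [0, 1, 2, 0, 1],
    [0, 1, 0, 0, 1],
    [0, 2, 1, 0, 1],
])
core = select_core_set(votes, num_classes=3, keep_fraction=0.4)
```

In the toy example the voters agree on samples 0, 3, and 4, split two to one on sample 1, and fully disagree on sample 2, so samples 1 and 2 are the ones kept. In an iterative scheme such as the one the abstract describes, the votes would be recomputed after each round of supervised training; that loop is omitted here.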

Keywords: convolutional neural network; core-set selection; supervised learning; active learning

Classification No.: TP301.6 [Automation and Computer Technology: Computer System Architecture]

 
