Affiliation: [1] College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, Jiangsu, China
Source: Journal of Software (软件学报), 2024, Issue 12, pp. 5544-5557 (14 pages)
Funding: National Natural Science Foundation of China (62076124); Postgraduate Research and Practice Innovation Program of Nanjing University of Aeronautics and Astronautics (xcxjh20221601).
Abstract: Online class-incremental continual learning aims to learn new classes effectively in data stream scenarios while keeping the model within small-cache and small-batch constraints. However, due to the one-pass nature of data streams, the class information within a small batch cannot be exploited over multiple passes as in offline learning. To alleviate this problem, current studies typically apply multiple data augmentations combined with contrastive replay for model training. Nevertheless, under the small-cache and small-batch constraints, existing strategies that select and store data randomly are not conducive to obtaining diverse negative samples, which limits the discriminability of the model. Previous studies have shown that hard negative samples are key to improving contrastive learning performance, yet they are rarely explored in online learning scenarios. The conceptually ambiguous data introduced by Universum learning offers a simple and intuitive way to generate such hard negatives; accordingly, mixup-induced Universum (MIU), obtained by interpolating samples with specific coefficients, has previously been shown to improve offline contrastive learning effectively. Inspired by this, this study attempts to introduce MIU into the online setting. Unlike the previously statically generated Universum, the data stream scenario raises additional challenges. First, as the number of classes grows dynamically, static Universum generated with respect to globally given classes is no longer applicable and must be redefined and generated dynamically. Therefore, this study proposes to recursively generate, using only the current (local) data, MIU whose entropy with respect to the seen classes is maximized (termed incremental MIU, IMIU), and to provide it with an additional small cache so that the overall memory limit is still satisfied. Second, the generated IMIU is interpolated again with the positive samples in the mini-batch to produce diverse and high-quality hard negative samples. Finally, by combining the above steps, an incrementally mixup-induced Universum based online class-incremental contrastive learning (IUCL) algorithm is developed. Comparative experiments on the standard datasets CIFAR-10, CIFAR-100, and Mini-ImageNet verify the consistent effectiveness of the proposed algorithm.
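The two mixup steps outlined in the abstract can be illustrated with a short sketch. The following minimal PyTorch example is not the authors' IUCL implementation: it uses a fixed interpolation coefficient of 0.5 instead of the paper's recursive, entropy-maximizing generation, and the function names (mixup_induced_universum, hard_negatives_from_miu) are hypothetical.

```python
# Minimal sketch (not the authors' released code) of the two mixup steps:
# (1) build mixup-induced Universum (MIU) samples that are class-ambiguous
#     with respect to the classes seen so far, and
# (2) interpolate the MIU samples with in-batch positives to obtain hard negatives.
# The fixed coefficient 0.5 and the random pairing strategy are assumptions.
import torch

def mixup_induced_universum(x, y, lam=0.5):
    """Mix pairs of samples drawn from different seen classes.

    With lam = 0.5 the mixture sits between two classes, so its label
    distribution over the seen classes has high entropy, which is the
    Universum-style ambiguity the abstract refers to.
    """
    perm = torch.randperm(x.size(0))
    # keep only pairs whose labels differ, otherwise the mixture stays in-class
    mask = y != y[perm]
    return lam * x[mask] + (1.0 - lam) * x[perm][mask]

def hard_negatives_from_miu(miu, positives, lam=0.5):
    """Second interpolation: pull MIU samples toward in-batch positives.

    The result stays close to the positives (hence "hard") while remaining
    class-ambiguous, and serves as extra negatives in the contrastive loss.
    """
    idx = torch.randint(0, positives.size(0), (miu.size(0),))
    return lam * miu + (1.0 - lam) * positives[idx]

# Toy usage: a mini-batch of 8 images from the current task.
x = torch.randn(8, 3, 32, 32)
y = torch.randint(0, 4, (8,))
miu = mixup_induced_universum(x, y)          # incremental Universum candidates
hard_neg = hard_negatives_from_miu(miu, x)   # diverse hard negative samples
```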
Keywords: machine learning; online class-incremental learning; contrastive learning; mixup; Universum
Classification code: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]