Robust learning method by reweighting examples with negative learning (original Chinese title: 基于负学习的样本重加权鲁棒学习方法)

Authors: ZOU Boshi (邹博士); YANG Ming (杨铭); ZONG Chenchen (宗辰辰); XIE Mingkun (谢明昆); HUANG Shengjun (黄圣君) [1] (College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, Jiangsu 211106, China)

Affiliation: [1] College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106

Source: Journal of Computer Applications (《计算机应用》), 2024, No. 5, pp. 1479-1484 (6 pages)

Abstract: Noisy label learning methods can effectively use data containing noisy labels to train models, significantly reducing the labeling cost of large-scale datasets. Existing noisy label learning methods usually assume that the number of examples in each class of the dataset is balanced. In many real-world scenarios, however, the data contain noisy labels and their true distribution is also long-tailed, which makes it difficult for existing methods to design an effective criterion, such as training loss or confidence, for separating clean examples from noisy examples in the tail classes. To solve the noisy long-tailed learning problem, a ReWeighting examples with Negative Learning (NLRW) method was proposed, by which examples were reweighted adaptively based on negative learning. Specifically, at each training epoch, the weights of examples were calculated according to the output distributions of the model for head classes and tail classes, so that the weights of clean examples were close to one while the weights of noisy examples were close to zero. To ensure that the model outputs, and hence the estimated weights, are accurate, negative learning and cross entropy loss were combined to train the model with an example-weighted loss function. Experimental results on CIFAR-10 and CIFAR-100 datasets with various imbalance rates and noise rates show that, compared with TBSS (Two stage Bi-dimensional Sample Selection), the best-performing baseline for noisy long-tailed classification, the NLRW method improves the average accuracy by 4.79% and 3.46%, respectively.

Keywords: noisy label learning; long-tailed learning; noisy long-tailed learning; example reweighting; negative learning

Classification number: TP391.4 [Automation and Computer Technology: Computer Application Technology]
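
Illustrative sketch (not part of the original record): the abstract describes training with an example-weighted loss that combines cross entropy with negative learning. The code below is a minimal pure-Python sketch of that general idea, not the authors' implementation. The abstract does not give the weight formula, so the per-example `weights` are simply passed in. The helper names, the `nl_coeff` parameter, the choice to apply the weight only to the cross-entropy term, and the uniform sampling of the complementary label are all assumptions made for illustration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label, eps=1e-12):
    """Positive learning: push the probability of the given label toward 1."""
    return -math.log(probs[label] + eps)


def negative_learning(probs, comp_label, eps=1e-12):
    """Negative learning: push the probability of a complementary label toward 0."""
    return -math.log(1.0 - probs[comp_label] + eps)


def sample_complementary(label, num_classes, rng):
    """Pick a uniformly random class other than the (possibly noisy) given label."""
    k = rng.randrange(num_classes - 1)
    return k if k < label else k + 1


def weighted_loss(batch_logits, labels, weights, rng, nl_coeff=1.0):
    """Batch mean of w_i * CE_i + nl_coeff * NL_i.

    Assumption: `weights` come from some external weighting rule
    (close to 1 for clean examples, close to 0 for noisy ones);
    the actual NLRW weight computation is not reproduced here.
    """
    total = 0.0
    for logits, y, w in zip(batch_logits, labels, weights):
        p = softmax(logits)
        y_bar = sample_complementary(y, len(logits), rng)
        total += w * cross_entropy(p, y) + nl_coeff * negative_learning(p, y_bar)
    return total / len(labels)


# Hypothetical batch: both examples look like class 0, but the second is labeled 1.
rng = random.Random(0)
logits = [[4.0, 0.0, 0.0], [4.0, 0.0, 0.0]]
labels = [0, 1]
weights = [1.0, 0.0]  # the suspicious example is down-weighted
loss = weighted_loss(logits, labels, weights, rng)
```

With a weight near zero, a mislabeled example contributes almost nothing through the cross-entropy term, while the negative learning term still uses it through a label that is only assumed to be wrong for that example.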
