Affiliations: [1] School of Artificial Intelligence and Data Science, Hebei University of Technology, Tianjin 300401, China; [2] Hebei Province Key Laboratory of Big Data Computing (Hebei University of Technology), Tianjin 300401, China; [3] Hebei Engineering Research Center of Data-Driven Industrial Intelligence (Hebei University of Technology), Tianjin 300401, China
Source: Journal of Image and Graphics (中国图象图形学报), 2024, No. 12, pp. 3684-3698 (15 pages)
Funding: National Natural Science Foundation of China (62306103, 62376194); Natural Science Research Project of Higher Education Institutions of Hebei Province (QN2023262).
Abstract: Objective Deep neural networks (DNNs) achieve excellent performance on computer vision classification tasks, but this success relies on large-scale, accurately labeled datasets, which are difficult and costly to collect in practice, especially in specialized fields where labeling requires domain experts. To cut costs, researchers build datasets via crowdsourced annotation, search-engine queries, and web crawling, but such datasets inevitably contain noisy labels, which seriously harm the generalization of DNNs because the networks memorize the noisy labels during training. Learning algorithms based on co-teaching, including Co-teaching+, JoCoR, and CoDis, can effectively alleviate learning on noisy-label data, but several shortcomings remain. First, a model trained with the cross-entropy (CE) loss is highly sensitive to noisy labels: it easily fits the mislabeled samples and fails to learn the true pattern of the data. Second, as training progresses, the parameters of the two co-teaching networks gradually become consistent; the networks prematurely converge to the same network, and the mutual learning process stops. Third, as iterations proceed, the networks inevitably memorize some noisy samples, so the CE loss value alone can no longer reliably distinguish noisy from clean samples, and a small-loss selection strategy that relies solely on CE loss becomes unreliable. To solve these problems, this paper proposes Co-history, a noise-robust learning method that incorporates historical information into co-teaching. Method First, to address the overfitting caused by the CE loss in a noisy-label environment, a correction loss is proposed by analyzing the historical pattern of per-sample losses, weakening the overfitting effect of the CE loss during training. Second, to address the premature convergence of the two networks in co-teaching, a difference loss is proposed to maintain the diversity of the two networks throughout training. Finally, following the small-loss selection strategy, a new sample selection method that combines historical sample losses is proposed, which selects clean samples more accurately. Result Comparative experiments were conducted on four simulated-noise datasets, F-MNIST (Fashion-MNIST), SVHN (street view house numbers), CIFAR-10, and CIFAR-100 (Canadian Institute for Advanced Research), and one real-world dataset, Clothing1M. Under 40% symmetric noise, the proposed method improves accuracy over co-teaching by 3.52%, 4.77%, 6.16%, and 6.96% on F-MNIST, SVHN, CIFAR-10, and CIFAR-100, respectively. On the real-world Clothing1M dataset, the best and final accuracies improve over co-teaching by 0.94% and 1.2%, respectively. Conclusion Extensive experiments demonstrate that the proposed noise-robust classification algorithm, which considers historical losses under co-teaching, effectively reduces the impact of noisy labels and improves model classification accuracy.
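The abstract's final step, selecting clean samples by combining historical per-sample losses with the small-loss strategy, can be illustrated with a minimal sketch. This is an assumption-laden toy, not the paper's exact formulation: the `momentum` smoothing, the function names, and the fixed `keep_ratio` are all hypothetical choices made here for illustration.

```python
import numpy as np

def update_history(history, batch_idx, batch_losses, momentum=0.9):
    """Smooth per-sample losses across epochs with an exponential
    moving average (one hypothetical way to accumulate 'history')."""
    history[batch_idx] = momentum * history[batch_idx] + (1 - momentum) * batch_losses
    return history

def select_small_loss(history, batch_idx, keep_ratio):
    """Small-loss strategy: keep the fraction of samples whose
    historical loss is smallest, treating them as likely clean."""
    losses = history[batch_idx]
    n_keep = max(1, int(keep_ratio * len(batch_idx)))
    order = np.argsort(losses)  # ascending: smallest losses first
    return batch_idx[order[:n_keep]]

# Toy usage with random "losses" standing in for CE losses.
rng = np.random.default_rng(0)
history = np.zeros(8)
batch_idx = np.arange(8)
for _ in range(5):  # pretend 5 epochs of per-sample losses
    history = update_history(history, batch_idx, rng.random(8))
clean_idx = select_small_loss(history, batch_idx, keep_ratio=0.5)
print(len(clean_idx))  # half of the batch is kept as "clean"
```

In a co-teaching setting, each network would apply this selection and pass its chosen indices to the peer network for the update step; smoothing over epochs is what makes the selection less sensitive to a single epoch's noisy CE value.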
Keywords: deep neural network (DNN); classification; noisy labels; co-teaching; historical loss
Classification code: TP391 [Automation and Computer Technology — Computer Application Technology]