Robust pseudo-label selection for holistic semi-supervised learning

Cited by: 1

Authors: Lanzhe GUO (郭兰哲); Yufeng LI (李宇峰) [1]

Affiliation: [1] National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China

Source: Scientia Sinica (Informationis) (《中国科学：信息科学》), 2024, No. 3, pp. 623-637 (15 pages)

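Funding: National Natural Science Foundation of China (Grant Nos. 62176118, 61921006); Chinese Association for Artificial Intelligence (CAAI)-Huawei MindSpore Academic Award Fund.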

Abstract: Semi-supervised learning (SSL) is an important machine learning paradigm that leverages unlabeled data to improve performance when labeled data are scarce. Although many studies report that SSL methods perform strongly on benchmark datasets, they still face critical limitations in real-world tasks: the quality of pseudo-labels is hard to judge, results are sensitive to hyper-parameter choices, and theoretical guidance is lacking. To address these issues, we propose a new holistic SSL approach with robust pseudo-label selection. Specifically, our proposal judges pseudo-label quality adaptively from the disagreement among model predictions, without pre-defined hyper-parameters, which markedly improves the robustness of SSL. Theoretically, we prove that the classification error of the new method decreases as the number of training iterations increases. Experimentally, the method outperforms mainstream techniques by a clear margin across various datasets. For example, compared with FixMatch, the best-performing SSL technique on CIFAR-10, we reduce the classification error by 11.8% on CIFAR-10 and by 18.8% on the more challenging STL-10 dataset.
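The abstract does not spell out the selection rule, so the following is only a hedged illustration of the general idea, not the paper's algorithm. It contrasts a fixed confidence threshold (the kind of pre-defined hyper-parameter the abstract says the method avoids; 0.95 is the value commonly cited for FixMatch) with a hypothetical agreement-based rule that keeps a pseudo-label only when several predictions for the same sample pick the same class. The function names and the example probability vectors are assumptions made for illustration.

```python
# Illustrative sketch only: the paper's actual selection rule is not given
# in the abstract. All names and numbers below are hypothetical.
from typing import Optional, Sequence


def argmax(probs: Sequence[float]) -> int:
    """Return the index of the largest probability (first one on ties)."""
    return max(range(len(probs)), key=lambda i: probs[i])


def select_by_threshold(probs: Sequence[float], tau: float = 0.95) -> Optional[int]:
    """Fixed-threshold rule: keep the pseudo-label only if confidence >= tau."""
    k = argmax(probs)
    return k if probs[k] >= tau else None


def select_by_agreement(views: Sequence[Sequence[float]]) -> Optional[int]:
    """Hypothetical threshold-free rule: keep the pseudo-label only if
    every prediction for the sample picks the same class."""
    labels = {argmax(p) for p in views}
    return labels.pop() if len(labels) == 1 else None


# Three made-up class-probability vectors for one unlabeled sample.
view_a = [0.10, 0.85, 0.05]
view_b = [0.20, 0.70, 0.10]
view_c = [0.60, 0.30, 0.10]

print(select_by_threshold(view_a))            # rejected: 0.85 < 0.95
print(select_by_agreement([view_a, view_b]))  # accepted: both pick class 1
print(select_by_agreement([view_a, view_c]))  # rejected: predictions disagree
```

The threshold rule depends on the choice of `tau`, whereas the agreement rule has no tunable value; this is the contrast the abstract draws, although the published method may measure disagreement quite differently.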

Keywords: machine learning; deep learning; semi-supervised learning; pseudo-labeling; robustness

Classification: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]
