Authors: Shaoyuan LI, Yuxiang ZHENG, Ye SHI, Shengjun HUANG, Songcan CHEN
Source: Frontiers of Computer Science (计算机科学前沿(英文版)), 2025, Issue 1, pp. 119-130 (12 pages)
Funding: Supported by the National Key R&D Program of China (2022ZD0114801), the National Natural Science Foundation of China (Grant No. 61906089), and the Jiangsu Province Basic Research Program (BK20190408).
Abstract: Recently, crowdsourcing has established itself as an efficient labeling solution by distributing tasks to crowd workers. Since workers of diverse expertise can make mistakes, one core learning task is to estimate each worker's expertise and aggregate over the workers to infer the latent true labels. In this paper, we show that noise-transition-matrix-based worker expertise modeling, one of the major research directions, commonly overfits the annotation noise, either due to an oversimplified noise assumption or inaccurate estimation. To solve this problem, we propose a knowledge distillation framework (KD-Crowd) that combines the complementary strengths of noise-model-free robust learning techniques and transition-matrix-based worker expertise modeling. The framework consists of two stages. In Stage 1, a noise-model-free robust student model is trained by treating the predictions of a transition-matrix-based crowdsourcing teacher model as noisy labels, aiming to correct the teacher's mistakes and obtain better true label predictions. In Stage 2, we switch their roles, retraining a better crowdsourcing model on the crowds' annotations, supervised by the refined true label predictions from Stage 1. Additionally, we propose an f-mutual information gain (MIG^f) knowledge distillation loss, which finds the maximum information intersection between the student's and the teacher's predictions. Our experiments show that MIG^f achieves clear improvements over the regular KL-divergence knowledge distillation loss, which tends to force the student to memorize all information in the teacher's prediction, including its errors. Extensive experiments show that, as a universal framework, KD-Crowd substantially improves previous crowdsourcing methods on both true label prediction and worker expertise estimation.
Keywords: crowdsourcing; label noise; worker expertise; knowledge distillation; robust learning
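
To make the two-stage procedure concrete, below is a minimal PyTorch sketch of the training loop the abstract describes. Everything in it is an illustrative assumption rather than the paper's implementation: the backbone, the generalized cross entropy used as the noise-model-free robust loss, the naive per-worker cross entropy standing in for the transition-matrix-based crowdsourcing model, and a plain KL term standing in for the MIG^f loss, whose formula is not given in this record.

# Hypothetical sketch of the two-stage KD-Crowd loop described in the
# abstract; all architectures, losses, and hyperparameters below are
# illustrative assumptions, not the paper's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_CLASSES = 10  # hypothetical label space


class Classifier(nn.Module):
    """Stand-in backbone; the paper's actual models may differ."""

    def __init__(self, in_dim=32, num_classes=NUM_CLASSES):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(),
                                 nn.Linear(64, num_classes))

    def forward(self, x):
        return self.net(x)


def gce_loss(logits, labels, q=0.7):
    # Generalized cross entropy (Zhang & Sabuncu, 2018): one example of a
    # noise-model-free robust loss; the paper may use a different choice.
    p_y = F.softmax(logits, dim=1).gather(1, labels.unsqueeze(1)).squeeze(1)
    return ((1.0 - p_y.clamp_min(1e-8).pow(q)) / q).mean()


def kl_distill_loss(student_logits, teacher_probs):
    # Regular KL distillation loss, the baseline the abstract compares MIG^f
    # against; used here only as a placeholder for the MIG^f loss itself.
    return F.kl_div(F.log_softmax(student_logits, dim=1), teacher_probs,
                    reduction="batchmean")


def naive_crowd_loss(logits, annotations):
    # Naive stand-in: average cross entropy over each worker's label. The
    # paper instead models each worker with a noise transition matrix.
    return torch.stack([F.cross_entropy(logits, annotations[:, w])
                        for w in range(annotations.size(1))]).mean()


def stage1_train_student(student, teacher, loader, opt):
    # Stage 1: treat the teacher's predictions as noisy labels and train a
    # noise-model-free robust student to correct the teacher's mistakes.
    teacher.eval()
    for x, _ in loader:
        with torch.no_grad():
            noisy_labels = teacher(x).argmax(dim=1)
        loss = gce_loss(student(x), noisy_labels)
        opt.zero_grad()
        loss.backward()
        opt.step()


def stage2_retrain_teacher(teacher, student, loader, opt, alpha=0.5):
    # Stage 2: switch roles, retraining the crowdsourcing model on the
    # workers' annotations, supervised by the student's refined predictions.
    student.eval()
    for x, annotations in loader:
        with torch.no_grad():
            refined = F.softmax(student(x), dim=1)
        logits = teacher(x)
        loss = (naive_crowd_loss(logits, annotations)
                + alpha * kl_distill_loss(logits, refined))
        opt.zero_grad()
        loss.backward()
        opt.step()


if __name__ == "__main__":
    torch.manual_seed(0)
    x = torch.randn(128, 32)                       # synthetic features
    ann = torch.randint(0, NUM_CLASSES, (128, 3))  # 3 hypothetical workers
    loader = [(x, ann)]
    teacher, student = Classifier(), Classifier()
    stage1_train_student(student, teacher, loader,
                         torch.optim.Adam(student.parameters(), lr=1e-3))
    stage2_retrain_teacher(teacher, student, loader,
                           torch.optim.Adam(teacher.parameters(), lr=1e-3))

The structural point the sketch preserves is the role switch: the teacher's predictions serve as noisy labels for the student in Stage 1, and the student's refined predictions supervise the teacher's retraining on the crowd annotations in Stage 2.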