Survey on Machine Unlearning (机器遗忘综述)

Authors: LI Zi-Tong (李梓童); MENG Xiao-Feng (孟小峰)[1]; WANG Lei-Xia (王雷霞); HAO Xin-Li (郝新丽) (School of Information, Renmin University of China, Beijing 100872, China)

Affiliation: [1] School of Information, Renmin University of China, Beijing 100872

Source: Journal of Software (《软件学报》), 2025, No. 4, pp. 1637-1664 (28 pages)

Funding: National Natural Science Foundation of China (61941121, 91846204, 6217242).

Abstract: In recent years, machine learning has become increasingly prevalent in daily life. Models trained on historical data predict future behavior and make people's lives more convenient. However, machine learning carries a risk of privacy leakage: when a user no longer wants their personal data to be used, merely deleting that data from the training set is not sufficient, because the already-trained model may still retain information about the user. To make a model "forget" the user's data, the simplest approach is to retrain it on a training set that excludes that data; the resulting model necessarily contains no information about it. Retraining is often costly, however, which raises the key question of machine unlearning: can a model that is as similar as possible to the retrained model be obtained at lower cost? This study surveys the literature on this question and categorizes existing unlearning methods into three groups: training-based, editing-based, and generation-based methods. It also introduces metrics for assessing unlearning methods, tests and evaluates existing unlearning methods in deep learning, and concludes with future research directions in the field.

Keywords: machine learning; machine unlearning; deep learning; privacy protection

Classification: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]
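
The key question posed in the abstract (obtaining a model close to the retrained one at lower cost) can be illustrated with a toy case in which the answer is exact. The sketch below is not a method from the survey: it assumes a hypothetical nearest-centroid model, `CentroidModel`, that stores per-class sums and counts, so one training point can be "forgotten" by subtracting its contribution instead of refitting. All names and data are made up for illustration, and deep models generally admit no such closed-form removal.

```python
class CentroidModel:
    """Hypothetical toy model: one centroid per class, stored as sums and counts."""

    def __init__(self):
        self.sums = {}    # label -> per-feature sum of training points
        self.counts = {}  # label -> number of training points

    def fit(self, X, y):
        # Full (re)training: one pass over every training point.
        self.sums, self.counts = {}, {}
        for x, label in zip(X, y):
            s = self.sums.setdefault(label, [0.0] * len(x))
            for i, v in enumerate(x):
                s[i] += v
            self.counts[label] = self.counts.get(label, 0) + 1
        return self

    def unlearn(self, x, label):
        # Forget one point by subtracting its contribution; cost does not
        # depend on the size of the remaining training set.
        if label not in self.counts:
            raise KeyError(f"no training data with label {label!r}")
        s = self.sums[label]
        for i, v in enumerate(x):
            s[i] -= v
        self.counts[label] -= 1
        if self.counts[label] == 0:
            del self.sums[label], self.counts[label]

    def centroids(self):
        return {c: [v / self.counts[c] for v in s] for c, s in self.sums.items()}


# Made-up data; the last point belongs to the user who asks to be forgotten.
X = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0], [5.0, 5.0], [2.0, 2.0]]
y = ["a", "a", "b", "b", "b"]
forget = 4

model = CentroidModel().fit(X, y)
before = model.centroids()                  # still reflects the user's point
model.unlearn(X[forget], y[forget])         # cheap update

# Baseline from the abstract: retrain on the data without the user's point.
retrained = CentroidModel().fit(X[:forget], y[:forget])

print(model.centroids() == retrained.centroids())
```

Here the cheap update and the retraining baseline coincide because the model is a simple sum of per-sample contributions. With arbitrary floating-point data the two results may differ by rounding error, so a tolerance-based comparison would be more appropriate than `==`.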
