Review of Machine Unlearning (遗忘学习综述)

Authors: HE Lisong (何黎松) [1]; YANG Yang (杨洋) [2]

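Affiliations: [1] School of Management, Xi'an Jiaotong University, Xi'an 710049, China; [2] School of Finance and Data Science, Xi'an Eurasia University, Xi'an 710065, China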

Source: Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), 2024, No. 11, pp. 2872-2886 (15 pages)

Abstract: To protect data privacy effectively and implement the “right to be forgotten”, it is necessary to eliminate the influence of specific subsets of training data from machine learning models and to ensure that these data cannot be inferred back from the models. To address this problem, the research field of “machine unlearning” has emerged in recent years. This paper reviews progress in machine unlearning research from three aspects: definitions, metrics, and algorithms. First, it outlines the core concepts, definitions, and evaluation metrics of machine unlearning, emphasizing the significance of certifiability metrics. Second, it divides unlearning algorithms into six classes according to their design principles: structured initial training, approximate influence-function estimation, gradient updates, noise unlearning, knowledge distillation unlearning, and boundary unlearning. It describes in detail nine representative machine unlearning algorithms and their evolution. Building on a comparison of the strengths and weaknesses of existing algorithms, the paper discusses the significance of constructing a unified, certification-based framework for machine unlearning, and analyzes the theoretical and practical relationships between machine unlearning research and privacy protection. Finally, it outlines future research directions: unlearning algorithms still need to be extended to subfields such as fair machine learning, transfer learning, and reinforcement learning; future unlearning algorithms may combine several design approaches; unlearning in practice requires collaboration between technology and regulation; and unifying machine unlearning with incremental learning would help improve the efficiency of managing and operating machine learning models.

Keywords: data privacy protection; machine unlearning; certifiability; unified framework

Classification code: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]

