Exploring Effective Factors Leading to Data Leakage in Pre-trained Language Models (影响预训练语言模型数据泄露的因素研究)

Authors: Qian Hanwei, Peng Jitian, Yuan Ming, Gao Guangliang, Liu Xiaoqian, Wang Qun [1]; Zhu Jingyu

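Affiliations: [1] Department of Computer Information and Cyber Security, Jiangsu Police Institute, Nanjing 210013; [2] The State Key Laboratory for Novel Software Technology (Nanjing University), Nanjing 210093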

Source: Journal of Information Security Research (《信息安全研究》), 2025, No. 2, pp. 181-188 (8 pages)

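Funding: National Natural Science Foundation of China (72401110); Philosophy and Social Science Research Project of Jiangsu Higher Education Institutions (2024SJYB0345); 2023 Jiangsu Universities "Qinglan Project" for Outstanding Young Backbone Teachers.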

Abstract: Widely used pre-trained language models learn general language representations from massive training corpora. Downstream natural language processing tasks perform significantly better when built on these models, but because deep neural networks overfit, pre-trained language models may leak private information from their training corpora. This paper takes T5, GPT-2, OPT, and other widely used pre-trained language models as research objects and uses model inversion attacks to explore the factors that affect their data leakage. In the experiments, the pre-trained language models generate a large number of samples, and the samples most likely to involve data leakage are selected by indicators such as perplexity and then verified. The results show that T5 and the other models all leak data to varying degrees; that, for the same model, the larger the model scale, the greater the likelihood of data leakage; and that adding a specific prefix makes leaked data easier to obtain. The paper closes with an outlook on future data leakage problems and defenses against them.

Keywords: natural language processing; pre-trained language models; privacy data leakage; model inversion attack; model architecture

Classification: TP183 [Automation and Computer Technology - Control Theory and Control Engineering]
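
The sample-selection step mentioned in the abstract (generate many samples, then pick the most suspicious ones by perplexity) could be sketched roughly as follows. This is only an illustration and not the authors' implementation. It assumes that lower perplexity marks a sample as more likely memorized, which the abstract does not state explicitly. The function names, the `top_k` parameter, and the per-token log-probabilities are all hypothetical. In practice the log-probabilities would come from the language model under test.

```python
import math
from typing import Dict, List, Tuple


def perplexity(token_logprobs: List[float]) -> float:
    """Perplexity = exp of the negative mean natural-log probability per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def rank_by_perplexity(
    samples: Dict[str, List[float]], top_k: int
) -> List[Tuple[str, float]]:
    """Return the top_k samples with the lowest perplexity.

    Assumption: a low perplexity means the model is unusually confident
    about the text, so it is treated as a leakage candidate to verify.
    """
    scored = [(text, perplexity(lps)) for text, lps in samples.items()]
    scored.sort(key=lambda pair: pair[1])  # ascending: most confident first
    return scored[:top_k]


# Hypothetical generated samples with made-up per-token log-probabilities.
samples = {
    "sample A": [-0.1, -0.2, -0.1],
    "sample B": [-2.0, -1.5, -2.5],
    "sample C": [-0.7, -0.9, -0.8],
}
candidates = rank_by_perplexity(samples, top_k=2)
```

The candidates returned here would still need to be checked against the training corpus, since low perplexity alone does not prove that a sample was memorized.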
