Model Extraction Attack Based on Stable Diffusion (基于Stable Diffusion的模型窃取攻击方法)

Authors: LI Ruoyu (李若宇); FENG Hui (冯辉); LI Qiang (李强); JI Ningning (季宁宁); TANG Beibei (唐贝贝); CHEN Lei (陈磊)

Affiliations: [1] School of Computer Science, Huainan Normal University, Huainan 232038, Anhui, China; [2] School of Materials Science and Physics, China University of Mining and Technology, Xuzhou 221116, Jiangsu, China; [3] School of Artificial Intelligence, Anhui University of Science and Technology, Huainan 232001, Anhui, China

Source: Journal of Harbin University of Commerce: Natural Sciences Edition (《哈尔滨商业大学学报(自然科学版)》), 2025, No. 2, pp. 169-175 (7 pages)

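Funding: 2023 China University Industry-University-Research Innovation Fund (2023IT166); Key Project of Philosophy and Social Science Research of the Anhui Provincial Department of Education (2023AH051520); Open Project of a State Key Laboratory (COGOS-2023HE02); Huainan Normal University Quality Engineering Project (2023hskc54); Huainan Normal University Key Project (2023HX106).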

Abstract: Existing data-free model extraction attacks struggle to fit the distribution of the original training set under a limited query budget, which in turn degrades how well they fit the decision boundary of the target model. To address this problem, a model extraction attack based on Stable Diffusion (MEASD) is proposed. Because training data generated by a pre-trained Stable Diffusion model may span multiple domains and contain a large number of non-discriminative samples, the ILAF method is designed to improve the quality of the generated data. The original samples of the high-quality synthetic data are combined with adversarial samples produced by an adversarial sample generator to form a substitute training set. A substitute model assembled from DPA modules then fits the decision boundary of the target model using this substitute training set. Experimental results on four mainstream benchmark datasets show that, compared with the EBFA and DMEAE methods, MEASD raises the degree of fit to the target model's decision boundary to 84% and achieves a black-box adversarial attack success rate of more than 68% against the target model, with a low query budget. MEASD thus effectively improves both the decision-boundary fit and the attack success rate.
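This record does not define how the two headline figures (decision-boundary fit and black-box attack success rate) are measured. As a rough illustration only, the sketch below assumes that the fit is the fraction of test inputs on which the substitute and target models predict the same label, and that attack success is the fraction of adversarial examples crafted on the substitute that the target misclassifies. The function names and toy labels are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def label_agreement(target_preds: Sequence[int],
                    substitute_preds: Sequence[int]) -> float:
    """Fraction of inputs where the substitute predicts the target's label.

    Assumption: this stands in for the "decision-boundary fit" in the
    abstract; the paper may define the metric differently.
    """
    if len(target_preds) != len(substitute_preds):
        raise ValueError("prediction lists must have the same length")
    if len(target_preds) == 0:
        raise ValueError("need at least one prediction")
    matches = sum(t == s for t, s in zip(target_preds, substitute_preds))
    return matches / len(target_preds)


def transfer_success_rate(true_labels: Sequence[int],
                          target_preds_on_adv: Sequence[int]) -> float:
    """Fraction of adversarial examples that the target misclassifies.

    Assumption: untargeted attack, where any wrong label counts as success.
    """
    if len(true_labels) != len(target_preds_on_adv):
        raise ValueError("label lists must have the same length")
    if len(true_labels) == 0:
        raise ValueError("need at least one example")
    fooled = sum(y != p for y, p in zip(true_labels, target_preds_on_adv))
    return fooled / len(true_labels)


if __name__ == "__main__":
    # Toy labels only; these are not results from the paper.
    print(label_agreement([0, 1, 2, 1, 0], [0, 1, 2, 0, 0]))
    print(transfer_success_rate([0, 1, 2, 1], [1, 1, 0, 2]))
```

In a real black-box setting, each entry of `target_preds` would cost one query to the target model, so the size of the evaluation set counts against the query budget.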

Keywords: deep learning; model extraction attack; Stable Diffusion; substitute model; adversarial attack; adversarial training

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
