Purging diffusion models through CLIP-based fine-tuning


Authors: WU Ping [1]; LIN Xin [1] (School of Computer Science and Technology, East China Normal University, Shanghai 200062, China)

Affiliation: [1] School of Computer Science and Technology, East China Normal University, Shanghai 200062, China

Source: Journal of East China Normal University (Natural Science), 2025, No. 1, pp. 138-150 (13 pages)

Funding: Open Project of the Key Laboratory of Advanced Theory and Application in Statistics and Data Science, Ministry of Education; Science and Technology Commission of Shanghai Municipality Project (21511100101).

Abstract: Diffusion models have revolutionized text-to-image synthesis, enabling end users to generate high-quality and imaginative artworks from simple natural-language text prompts. Unfortunately, because the training datasets are large and unfiltered, text-to-image models are capable of generating inappropriate content such as nudity and violence. To deploy such models at a higher level of safety, we propose a novel method, directional contrastive language-image pre-training (CLIP) loss-based fine-tuning, dubbed CLIF. This method uses a directional CLIP loss to fine-tune the model and suppress its ability to generate inappropriate content. CLIF is computationally lightweight and immune to circumvention. To demonstrate its effectiveness, we propose a benchmark called categorized toxic prompts (CTP) for evaluating the inappropriate-content generation ability of text-to-image diffusion models. As shown by our experiments on the CTP and common objects in context (COCO) datasets, CLIF significantly suppresses unsafe generation by text-to-image diffusion models while preserving their ability to produce general content.
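The abstract does not spell out the loss, but a "directional" CLIP loss typically aligns the shift in image embeddings (fine-tuned vs. original output) with the shift in text embeddings (safe target prompt vs. unsafe source prompt), penalizing one minus their cosine similarity. The following is a minimal sketch under that assumption, with toy stand-in vectors in place of real CLIP encoder outputs; the paper's exact objective may differ, and all variable names and values here are hypothetical:

```python
import math

def _unit(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def directional_clip_loss(img_src, img_ft, txt_src, txt_tgt):
    """1 - cosine similarity between the image-embedding shift
    (fine-tuned minus original) and the text-embedding shift
    (target prompt minus source prompt)."""
    d_img = _unit([a - b for a, b in zip(img_ft, img_src)])
    d_txt = _unit([a - b for a, b in zip(txt_tgt, txt_src)])
    return 1.0 - sum(a * b for a, b in zip(d_img, d_txt))

# Toy 4-d stand-ins for CLIP embeddings (hypothetical values).
img_src = [1.0, 0.0, 0.0, 0.0]   # image embedding before fine-tuning
txt_src = [0.9, 0.1, 0.0, 0.0]   # embedding of the unsafe source prompt
txt_tgt = [0.0, 1.0, 0.0, 0.0]   # embedding of the safe target prompt

# If fine-tuning shifts the image embedding exactly along the
# text direction, the directional loss vanishes.
step = _unit([a - b for a, b in zip(txt_tgt, txt_src)])
img_ft = [a + b for a, b in zip(img_src, step)]
loss = directional_clip_loss(img_src, img_ft, txt_src, txt_tgt)
print(loss)
```

Here the printed loss is zero up to floating-point error, since the image shift was constructed to be parallel to the text shift; an orthogonal shift would give a loss of 1, and an opposing shift a loss of 2.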

Keywords: text-to-image generation model; safety; dataset; diffusion model

Classification: TP391.4 [Automation and Computer Technology: Computer Application Technology]

 
