Adversarial Sample Generation for Chinese Classification Model Based on Prototypical Network

Cited by: 2

Authors: YANG Yanyan [1]; XIE Mingxuan; CAO Jiangxia; WANG Xuebin [2]; LIU Tingwen [2,3]; DU Yanhui

Affiliations: [1] College of Information and Cyber Security, People's Public Security University of China, Beijing 100038, China; [2] Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100084, China; [3] School of Cyber Security, University of Chinese Academy of Sciences, Beijing 100049, China

Source: Computer Engineering (《计算机工程》), 2023, Issue 8, pp. 54-62 (9 pages)

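Funding: National Key Research and Development Program of China (2021YFB3100600); Strategic Priority Research Program of the Chinese Academy of Sciences (XDC02040400); Youth Innovation Promotion Association of the Chinese Academy of Sciences (2021153).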

Abstract: Adversarial sample generation adds imperceptible perturbations to the original text so that a deep learning model produces incorrect output, and it is commonly used to test the robustness of text classification models. Most existing adversarial sample generation methods are black-box or white-box attacks: they must interact with the victim model while generating adversarial samples, and their attack effect depends on the structure and performance of the victim model, so they generalize poorly. In addition, the transformation strategies used by adversarial sample generation methods for Chinese text are too limited to generate diverse Chinese adversarial samples. To address these issues, this study proposes AEGP, an adversarial sample generation method based on a prototypical network. First, based on a comprehensive analysis of the structural characteristics of Chinese characters and of human reading habits, eight Chinese text transformation strategies that preserve semantic consistency are designed. Next, a prototypical network is built with a convolutional neural network as the encoder, and other texts in the same category are used to help identify the text fragments that need to be transformed. Finally, the transformation strategies are applied to the selected text fragments to generate adversarial samples. The experimental results show that AEGP has good generality and can generate diverse adversarial samples. Compared with the baseline methods, the accuracy of the classification models on the adversarial samples generated by AEGP dropped by 9.21 to 32.64 percentage points, demonstrating the sensitivity of the models to imperceptible perturbations.
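The prototypical-network step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `encode` is a toy character-bucket encoder standing in for the convolutional encoder, and `rank_fragments` uses leave-one-out scoring, which is an assumption made here for illustration, since the abstract does not state how fragments are selected. All function names are hypothetical. The only part taken from the standard prototypical-network formulation is that a class prototype is the mean embedding of that class's texts and that a text is assigned to the nearest prototype.

```python
import math

DIM = 8  # toy embedding size (assumption, not from the paper)


def encode(text):
    """Toy stand-in for the CNN encoder: normalized character-bucket counts."""
    vec = [0.0] * DIM
    for ch in text:
        vec[ord(ch) % DIM] += 1.0
    total = sum(vec) or 1.0  # avoid dividing by zero for empty text
    return [v / total for v in vec]


def prototype(texts):
    """Class prototype: the mean of the embeddings of the class's texts."""
    embs = [encode(t) for t in texts]
    return [sum(col) / len(embs) for col in zip(*embs)]


def distance(a, b):
    """Euclidean distance between two embeddings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(text, prototypes):
    """Assign the text to the label whose prototype is nearest."""
    emb = encode(text)
    return min(prototypes, key=lambda label: distance(emb, prototypes[label]))


def rank_fragments(text, proto):
    """Hypothetical leave-one-out scoring (an assumption, not the paper's rule).

    Each character is scored by how much deleting it moves the text away
    from its own class prototype; the highest-scoring characters are the
    ones that tie the text most strongly to its class.
    """
    base = distance(encode(text), proto)
    scores = []
    for i in range(len(text)):
        reduced = text[:i] + text[i + 1:]
        scores.append((distance(encode(reduced), proto) - base, i, text[i]))
    return sorted(scores, reverse=True)


if __name__ == "__main__":
    protos = {"pos": prototype(["aaaa"]), "neg": prototype(["bbbb"])}
    print(classify("aab", protos))
    print(rank_fragments("aab", protos["pos"]))
```

In a pipeline of this shape, the top-ranked fragments would be the candidates handed to a text transformation strategy. Because the scores come from prototypes built on other texts of the same class, the sketch never queries a victim classifier.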

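Keywords: adversarial sample generation; classification model; prototypical network; text representation; transformation strategy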

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]
