Authors: Zheng Mingyu; Lin Zheng [1,2]; Liu Zhengxiao; Fu Peng; Wang Weiping [1]
Affiliations: [1] Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093; [2] School of Cyber Security, University of Chinese Academy of Sciences, Beijing 100049
Source: Journal of Computer Research and Development (《计算机研究与发展》), 2024, No. 1, pp. 221-242 (22 pages)
Funding: National Natural Science Foundation of China (61976207, 61906187).
Abstract: The security and robustness of deep neural networks (DNNs) are active research topics in the deep learning community. Previous work mainly exposed the fragility of DNNs from the perspective of adversarial attacks, which craft adversarial examples to degrade model performance, and studied how to defend against them. With the wide adoption of pre-trained models (PTMs), however, a new security threat against DNNs, and PTMs in particular, has emerged: the backdoor attack. A backdoor attack injects a hidden backdoor into a DNN so that the backdoored model behaves normally on clean inputs but produces attacker-specified outputs on poisoned inputs that contain a trigger (a pattern or piece of text predefined by the attacker). Such attacks pose a severe threat to DNN-based systems such as spam filters and hate speech detectors. Whereas textual adversarial attack and defense have been widely studied, textual backdoor attack and defense have not been thoroughly investigated and lack a systematic review. This paper presents a comprehensive survey of backdoor attack and defense techniques in the text domain. It first introduces the basic workflow of textual backdoor attacks, categorizes attack and defense methods from different perspectives, and introduces representative work together with its strengths and weaknesses. It then lists widely used benchmark datasets and evaluation metrics, and compares backdoor attacks with two related security threats, adversarial attacks and data poisoning. Finally, it discusses the open challenges of backdoor attack and defense in the text domain and presents several promising future directions for this emerging and rapidly growing research area.
Keywords: backdoor attack; backdoor defense; natural language processing; pre-trained model; AI security
Classification code: TP309.2 [Automation and Computer Technology: Computer System Architecture]