Data poisoning defense based on sample distribution characteristics (cited by: 3)

Authors: Yang Lisheng; Luo Wenhua [1] (School of Public Security Information Technology & Intelligence, Criminal Investigation Police University of China, Shenyang 110035, China)

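Affiliation: [1] School of Public Security Information Technology & Intelligence, Criminal Investigation Police University of China, Shenyang 110035, China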

Source: Application Research of Computers (《计算机应用研究》), 2023, No. 9, pp. 2845-2850 (6 pages)

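Funding: National Key R&D Program of China (2018YFC0830600); Graduate Innovation Ability Improvement Project of the Criminal Investigation Police University of China (2022YCZD05).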

Abstract: Traffic classification models are vulnerable to data contamination during updates, which degrades model performance. Existing defenses based on data cleaning rely on expert experience and manual screening, and cannot effectively counter poisoning attacks constructed from samples of unknown distribution. To address these problems, and inspired by out-of-distribution detection and discriminative active learning, this paper designs a data poisoning defense method based on sample distribution characteristics. A binary discriminator separates the samples added in each round into known-distribution and unknown-distribution samples. For new known-distribution samples, the agreement rate between model predictions and annotated labels is used to assess the data quality of the new samples and decide whether to update the model. For new unknown-distribution samples, a small-sample spot check based on labeling accuracy is used to assess sample usability. Experimental results show that the method maintains model accuracy while resisting data poisoning attacks, and effectively identifies poisoning attacks constructed from unknown-distribution samples.
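The screening flow described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract gives no thresholds, discriminator design, or sampling size, so the callables `is_known`, `predict`, and `verify`, the parameters `agree_threshold`, `accuracy_threshold`, and `k`, and the toy data are all hypothetical stand-ins chosen for illustration.

```python
import random

def agreement_rate(predicted, annotated):
    """Fraction of samples whose model prediction equals the supplied label."""
    if not annotated:
        return 0.0
    return sum(p == a for p, a in zip(predicted, annotated)) / len(annotated)

def spot_check_accuracy(samples, labels, verify, k, rng):
    """Estimate labeling accuracy from up to k randomly drawn samples.

    `verify` stands in for a trusted check (e.g. a human reviewer); hypothetical.
    """
    if not samples:
        return 0.0
    chosen = rng.sample(range(len(samples)), min(k, len(samples)))
    return sum(verify(samples[i]) == labels[i] for i in chosen) / len(chosen)

def screen_batch(samples, labels, is_known, predict, verify,
                 agree_threshold=0.9, accuracy_threshold=0.9, k=5, seed=0):
    """Split a batch with a binary discriminator, then assess each part.

    All thresholds and k are assumed values, not taken from the paper.
    """
    rng = random.Random(seed)
    known = [(s, y) for s, y in zip(samples, labels) if is_known(s)]
    unknown = [(s, y) for s, y in zip(samples, labels) if not is_known(s)]

    # Known distribution: compare model predictions with the supplied labels.
    known_rate = agreement_rate([predict(s) for s, _ in known],
                                [y for _, y in known])
    # Unknown distribution: spot-check a few labels against the trusted check.
    unknown_acc = spot_check_accuracy([s for s, _ in unknown],
                                      [y for _, y in unknown], verify, k, rng)
    return {
        "known_rate": known_rate,
        "accept_known": bool(known) and known_rate >= agree_threshold,
        "unknown_accuracy": unknown_acc,
        "accept_unknown": bool(unknown) and unknown_acc >= accuracy_threshold,
    }

# Toy setup (hypothetical): 1-D samples, classes 0/1 are known, class 2 is new.
is_known = lambda x: x < 10                            # stand-in discriminator
predict = lambda x: 0 if x < 5 else 1                  # stand-in current model
verify = lambda x: 0 if x < 5 else (1 if x < 10 else 2)  # stand-in trusted check

samples = [1, 2, 6, 7, 12, 13, 14]
clean_labels = [0, 0, 1, 1, 2, 2, 2]
poisoned_labels = [1, 1, 0, 1, 0, 0, 2]  # flipped labels in both parts

clean_result = screen_batch(samples, clean_labels, is_known, predict, verify)
poisoned_result = screen_batch(samples, poisoned_labels, is_known, predict, verify)
```

In this toy run the clean batch passes both checks, while the batch with flipped labels is rejected on both the known-distribution and unknown-distribution paths. In practice the discriminator would be a trained binary classifier and the spot check a manual review; both are reduced to simple callables here.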

Keywords: AI security; traffic classification model; data poisoning attack; sample distribution characteristics

Classification number: TP309 [Automation and Computer Technology: Computer System Architecture]
