Authors: Yang Lisheng (杨立圣), Luo Wenhua (罗文华) [1]
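Affiliation: [1] School of Public Security Information Technology & Intelligence, Criminal Investigation Police University of China, Shenyang 110035, China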
Source: Application Research of Computers (《计算机应用研究》), 2023, No. 9, pp. 2845-2850 (6 pages)
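Funding: National Key Research and Development Program of China (2018YFC0830600); Graduate Innovation Ability Improvement Project of the Criminal Investigation Police University of China (2022YCZD05).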
Abstract: Traffic classification models are vulnerable to data pollution during updates, which degrades model performance. Existing defenses based on data cleaning rely on expert experience and manual screening, and they cannot effectively counter poisoning attacks constructed from samples of unknown distribution. To address these problems, and inspired by out-of-distribution detection and discriminative active learning, this paper designs a data poisoning defense method based on sample distribution characteristics. A binary discriminator separates the samples added in each round into known-distribution and unknown-distribution samples. For new known-distribution samples, the agreement rate between model predictions and annotations is used to evaluate data quality and decide whether to update the model. For new unknown-distribution samples, a few-sample spot check based on labeling accuracy is used to evaluate sample usability. Experimental results show that the method maintains model accuracy while resisting data poisoning attacks, and that it effectively identifies poisoning attacks constructed from unknown-distribution samples.
Keywords: AI security; traffic classification model; data poisoning attack; sample distribution characteristics
Classification code: TP309 [Automation and Computer Technology: Computer System Architecture]
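The screening workflow summarized in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the `is_known` discriminator callback, the `verify_label` spot-check callback, and the threshold values (0.9, 0.8, sample size 5) are all hypothetical choices made for illustration, since the record gives no concrete parameters.

```python
import random
from typing import Callable, List, Sequence, Tuple


def split_by_discriminator(
    samples: Sequence, is_known: Callable[[object], bool]
) -> Tuple[List, List]:
    """Split new samples into (known-distribution, unknown-distribution).

    `is_known` stands in for the binary discriminator; it is a hypothetical
    callback, not an API from the paper.
    """
    known = [s for s in samples if is_known(s)]
    unknown = [s for s in samples if not is_known(s)]
    return known, unknown


def agreement_rate(predicted: Sequence, annotated: Sequence) -> float:
    """Fraction of samples whose model prediction matches the supplied label."""
    if len(predicted) != len(annotated) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    matches = sum(p == a for p, a in zip(predicted, annotated))
    return matches / len(predicted)


def accept_known_batch(
    predicted: Sequence, annotated: Sequence, min_agreement: float = 0.9
) -> bool:
    """Accept a known-distribution batch for model update only if the
    agreement rate reaches the (assumed) threshold."""
    return agreement_rate(predicted, annotated) >= min_agreement


def spot_check_unknown_batch(
    annotated: Sequence,
    verify_label: Callable[[int], object],
    sample_size: int = 5,
    min_accuracy: float = 0.8,
    seed: int = 0,
) -> bool:
    """Draw a small random sample of an unknown-distribution batch and accept
    the batch only if enough of the sampled labels are verified as correct.

    `verify_label(i)` is a hypothetical oracle (for example, a human reviewer)
    returning the trusted label of sample i.
    """
    if not annotated:
        raise ValueError("batch is empty")
    rng = random.Random(seed)
    indices = rng.sample(range(len(annotated)), min(sample_size, len(annotated)))
    correct = sum(annotated[i] == verify_label(i) for i in indices)
    return correct / len(indices) >= min_accuracy
```

In this sketch a rejected batch is simply not used for the update; how the published method handles rejected batches is not stated in the abstract.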