Stealthy Attack Towards Speaker Recognition Based on One-"Audio Pixel" Perturbation

Cited by: 3

Authors: Shen Yijie; Li Liangcheng; Liu Ziwei; Liu Tiantian; Luo Hao [1]; Shen Ting; Lin Feng [1,2]; Ren Kui [1]

Affiliations: [1] Institute of Cyberspace Research, Zhejiang University, Hangzhou 310027; [2] Key Laboratory of Blockchain and Cyberspace Governance of Zhejiang Province (Zhejiang University), Hangzhou 310027; [3] Zhejiang Dong an Testing Technology Co., Ltd., Hangzhou 310063

Source: Journal of Computer Research and Development, 2021, No. 11, pp. 2350-2363 (14 pages)

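Funding: National Key Research and Development Program of China (2020AAA0107700); National Natural Science Foundation of China (62032021, 61772236, 61972348); Key Research and Development Program of Zhejiang Province (2019C03133); Zhejiang Province program for introducing and cultivating leading innovation and entrepreneurship teams (2018R01005); Alibaba-Zhejiang University Joint Research Center of Frontier Technologies; Research Base for International Cyberspace Governance.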

Abstract: Existing attacks on speaker recognition systems must inject a long-duration perturbation into the audio, so they are easily detected by machines or administrators. We propose a novel stealthy attack on speaker recognition based on a one-"audio pixel" perturbation. The attack exploits two properties of the differential evolution algorithm: it treats the model as a black box, and its search does not rely on gradient information. This overcomes the inability of previous attacks to constrain the perturbation duration, and makes an effective attack possible with a single "audio pixel" perturbation. In particular, to handle the time-series nature of audio data, we design a candidate-point construction scheme based on a tuple of audio segment, audio point, and perturbation value, which solves the problem that candidate points for the differential evolution algorithm are otherwise hard to describe in this attack setting. Experiments on 60 speakers from the LibriSpeech dataset show that the attack achieves a 100% success rate. We also conduct extensive experiments to explore how different conditions (e.g., gender, dataset, and speaker recognition method) affect the performance of the stealthy attack; the results provide guidance for mounting effective attacks. Finally, we propose defense ideas based on denoisers, reconstruction algorithms, and speech compression, respectively.
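The candidate-point idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the recognizer `toy_true_speaker_confidence`, the segment length, the bounds, and the DE settings (a plain rand/1/bin variant with `f=0.5`, `cr=0.7`) are all assumptions chosen only to make the example runnable. Each candidate is a tuple (audio segment, audio point, perturbation value), and the search only ever queries the model's output score, never its gradients.

```python
import math
import random

def toy_true_speaker_confidence(audio):
    # Hypothetical stand-in for a black-box recognizer: confidence in the
    # true speaker drops as the peak amplitude grows. Not a real model.
    peak = max(abs(x) for x in audio)
    return 1.0 / (1.0 + 5.0 * peak)

def apply_candidate(audio, cand, seg_len):
    # Perturb exactly one sample addressed by (segment, point, delta).
    n_seg = len(audio) // seg_len
    seg = min(int(cand[0]), n_seg - 1)
    point = min(int(cand[1]), seg_len - 1)
    idx = seg * seg_len + point
    out = list(audio)
    out[idx] = max(-1.0, min(1.0, out[idx] + cand[2]))
    return out

def differential_evolution(score, bounds, pop_size=20, iters=40,
                           f=0.5, cr=0.7, seed=0):
    # Gradient-free minimization of score() using rand/1/bin mutation.
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fit = [score(p) for p in pop]
    for _ in range(iters):
        for i in range(pop_size):
            others = [j for j in range(pop_size) if j != i]
            a, b, c = rng.sample(others, 3)
            forced = rng.randrange(len(bounds))  # always mutate one dimension
            trial = []
            for d, (lo, hi) in enumerate(bounds):
                if d == forced or rng.random() < cr:
                    v = pop[a][d] + f * (pop[b][d] - pop[c][d])
                else:
                    v = pop[i][d]
                trial.append(max(lo, min(hi, v)))  # keep inside bounds
            s = score(trial)
            if s <= fit[i]:  # greedy selection: keep the better candidate
                pop[i], fit[i] = trial, s
    best = min(range(pop_size), key=fit.__getitem__)
    return pop[best], fit[best]

# Usage on a synthetic 400-sample tone split into 8 segments of 50 samples.
seg_len = 50
audio = [0.1 * math.sin(2 * math.pi * i / 50) for i in range(400)]
bounds = [(0, len(audio) // seg_len), (0, seg_len), (-0.5, 0.5)]
best, best_conf = differential_evolution(
    lambda c: toy_true_speaker_confidence(apply_candidate(audio, c, seg_len)),
    bounds,
)
adv = apply_candidate(audio, best, seg_len)
```

Encoding the position as (segment, point) rather than a single flat index is one way to read the abstract's tuple construction. How the paper maps tuples to samples, and which fitness function it minimizes, is not stated in this record.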

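Keywords: one-"audio pixel" perturbation; black-box attack; speaker recognition; differential evolution algorithm; perturbation attack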

Classification: TP309 [Automation and Computer Technology: Computer System Architecture]
