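Affiliations: [1] Key Laboratory of Water Big Data Technology of Ministry of Water Resources (Hohai University), Nanjing, Jiangsu 211100 [2] College of Computer and Information, Hohai University, Nanjing, Jiangsu 211100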
Source: Journal of Software (《软件学报》), 2022, No. 5, pp. 1569-1586 (18 pages)
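Funding: National Natural Science Foundation of China (U21B2016, 61702159); Natural Science Foundation of Jiangsu Province (BK20191297); Fundamental Research Funds for the Central Universities.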
Abstract: As deep learning has matured, intelligent speech recognition software has come into wide use, and the deep neural networks inside such software play a key role. Recent studies show, however, that adversarial examples containing small perturbations pose a serious threat to the security and robustness of deep neural networks. Researchers typically feed generated adversarial examples to intelligent speech recognition software as test cases, observe whether they cause the software to make wrong judgments, and then adopt defense methods to improve the software's security and robustness. Black-box intelligent speech software is common in everyday use and therefore of practical research value, yet existing adversarial example generation methods for it have limitations. This study therefore proposes a targeted adversarial example generation method for black-box speech software based on the firefly algorithm and a gradient evaluation method, namely the firefly-gradient adversarial example generation method. Given a target text, perturbations are added repeatedly to the original audio sample; depending on the edit distance between the transcription of the current adversarial example and the target text, either the firefly algorithm or the gradient evaluation method is selected to optimize the example until the targeted adversarial example is produced. To verify its effectiveness, the method is evaluated on commonly used speech recognition software with three different types of speech datasets (the Common Speech dataset, the Google Command dataset, and the LibriSpeech dataset), and volunteers are recruited to assess the quality of the generated adversarial examples. The experiments show that the proposed method effectively improves the success rate of targeted adversarial example generation; for example, for the DeepSpeech speech recognition software, the success rate on the Common Speech dataset is 13% higher than that of the comparison method.
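The selection step described in the abstract, choosing an optimizer from the edit distance between the current transcription and the target text, can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not state which optimizer is used at which distance or what the switching rule is, so the sketch assumes that the population-based firefly search runs while the transcription is still far from the target and that gradient evaluation takes over once it is close. The names `edit_distance` and `choose_optimizer` and the `threshold` value of 2 are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions, substitutions each cost 1."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def choose_optimizer(current_text: str, target_text: str, threshold: int = 2) -> str:
    """Pick the next optimization step from the distance to the target text.

    ASSUMPTION: the threshold and the direction of the switch are illustrative
    choices, not values taken from the paper.
    """
    distance = edit_distance(current_text, target_text)
    if distance == 0:
        return "done"  # the transcription already equals the target text
    if distance <= threshold:
        return "gradient_evaluation"  # assumed: fine-tuning near the target
    return "firefly"  # assumed: broad search far from the target


if __name__ == "__main__":
    print(choose_optimizer("open the door", "hello world"))
    print(choose_optimizer("hello word", "hello world"))
```

In a black-box setting only the transcription returned by the recognizer is observable, which is why the sketch decides from text alone and never touches model internals.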
Keywords: intelligent software; speech recognition; adversarial example; firefly algorithm; gradient evaluation method
Classification: TP311 [Automation and Computer Technology - Computer Software and Theory]