Authors: GAN Zijian (甘子健), YE Dengpan (叶登攀) [1], ZHANG Jian (张健) [2]
Affiliations: [1] Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, School of Cyber Science and Engineering, Wuhan University, Wuhan 430072, China; [2] Hunan Province Financial Currency Identification and Self-service Platform Engineering Technology Research Center, School of Computer Science, Central South University, Changsha 410083, China
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2024, No. 8, pp. 1960-1965 (6 pages)
Funding: Supported by the General Program of the National Natural Science Foundation of China (62272485).
Abstract: Voice conversion changes the voice identity of one speaker into that of another while keeping the linguistic content unchanged. However, existing work rarely studies how to evade audio forensic classification models, so converted audio is easily recognized by such models. This paper proposes HADV-GAN, a voice conversion anti-forensic framework with three sub-band spectral discriminators, whose synthesized audio retains high fidelity while resisting detection by voice spoofing forensic models. In addition, HADV-GAN does not require training an additional vocoder: it takes the raw audio waveform directly as input and reconstructs speech from acoustic features, which avoids the feature mismatch problem caused by using a vocoder. Experimental results show that, on three mainstream voice spoofing forensic models (LFCC-GMM, MCG-Res2Net, and AASIST), the proposed method achieves better anti-forensic capability than the baseline NVC-Net at comparable synthesized audio quality.
Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]