Voice Conversion Anti-forensic Framework Based on Subband Harmonic Consistency

Authors: GAN Zijian; YE Dengpan [1]; ZHANG Jian [2]

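Affiliations: [1] Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, School of Cyber Science and Engineering, Wuhan University, Wuhan 430072, China; [2] Hunan Province Financial Currency Identification and Self-service Platform Engineering Technology Research Center, School of Computer Science, Central South University, Changsha 410083, China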

Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2024, No. 8, pp. 1960-1965 (6 pages)

Funding: Supported by the General Program of the National Natural Science Foundation of China (62272485).

Abstract: Voice conversion changes the voice identity of one speaker into that of another while keeping the linguistic content unchanged. Existing work, however, rarely studies detection resistance against machine classification models used in audio forensics, so converted audio is easily recognized by forensic models. This paper proposes HADV-GAN, a voice conversion anti-forensic framework designed with three sub-band spectral discriminators, whose synthesized audio retains high fidelity while resisting voice spoofing forensic models. In addition, HADV-GAN does not require training an additional vocoder: it takes the raw audio waveform directly as input and reconstructs speech from acoustic features, which avoids the feature mismatch problem caused by using a vocoder. Experimental results show that, on three mainstream voice spoofing forensic models (LFCC-GMM, MCG-Res2Net, and AASIST), the proposed method achieves better anti-forensic capability than the baseline model NVC-Net at comparable synthesized audio quality.

Keywords: voice conversion; voice spoofing forensics; sub-band spectrum; audio anti-forensics

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]
