Unsupervised cross-domain speaker recognition based on distribution alignment and adversarial learning  Cited by: 1

Authors: CHEN Zhigao; ZHAO Qingwei [1]; WANG Li; WANG Wenchao (Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190; University of Chinese Academy of Sciences, Beijing 100049)

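Affiliations: [1] Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190; [2] University of Chinese Academy of Sciences, Beijing 100049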

Source: Acta Acustica (声学学报), 2021, No. 5, pp. 767-774 (8 pages)

Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11590774, 11590772, 11590770).

Abstract: Domain mismatch is one of the biggest challenges for realistic speaker recognition systems, especially when labeled data in the target domain are unavailable. To address this problem, an unsupervised domain adaptation method is proposed that fuses distribution alignment with adversarial learning. Aligning statistical distributions during training reduces the domain discrepancy, so that more speaker-discriminative features are extracted from speech, and consistent performance improvements are achieved under a variety of domain mismatch conditions. In text-dependent speaker recognition tasks, adversarial learning and distribution alignment work together and reduce the equal error rate by 11% relative. In text-independent tasks, adversarial learning is unstable and contributes little, while distribution alignment still achieves a relative improvement of 8%. The experimental results show that, when the domains are mismatched and the target domain lacks labeled data, the proposed method effectively extracts speaker-discriminative information from speech and steadily improves recognition performance.
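The abstract does not state which statistics are aligned or how the alignment term is weighted, so the following is only a minimal sketch of the general idea of distribution alignment, not the authors' method. It assumes that alignment means matching the mean and covariance of source-domain and target-domain embeddings within a batch (in the spirit of moment-matching losses); the function name `moment_alignment_loss`, the normalization constant, and the synthetic data are all hypothetical.

```python
import numpy as np


def moment_alignment_loss(source, target):
    """Distance between first- and second-order statistics of two batches.

    source, target: arrays of shape (n_samples, dim) holding embeddings from
    the labeled source domain and the unlabeled target domain.
    This is an assumed, illustrative loss, not the one used in the paper.
    """
    # First-order term: squared distance between the batch means.
    mean_gap = np.sum((source.mean(axis=0) - target.mean(axis=0)) ** 2)

    # Second-order term: squared Frobenius distance between covariances,
    # scaled by 1 / (4 * dim^2) so it does not grow with the embedding size.
    cov_s = np.cov(source, rowvar=False)
    cov_t = np.cov(target, rowvar=False)
    dim = source.shape[1]
    cov_gap = np.sum((cov_s - cov_t) ** 2) / (4.0 * dim ** 2)

    return mean_gap + cov_gap


# Synthetic embeddings: one target batch is shifted and rescaled relative to
# the source (domain mismatch), the other is drawn from the same distribution.
rng = np.random.default_rng(0)
src = rng.normal(0.0, 1.0, size=(256, 8))
tgt_shifted = rng.normal(0.5, 2.0, size=(256, 8))
tgt_matched = rng.normal(0.0, 1.0, size=(256, 8))

loss_shifted = moment_alignment_loss(src, tgt_shifted)
loss_matched = moment_alignment_loss(src, tgt_matched)
print(f"mismatched domains: {loss_shifted:.4f}")
print(f"matched domains:    {loss_matched:.4f}")
```

In a training setup of the kind the abstract describes, a term like this would be added to the speaker classification loss (and, where used, an adversarial domain loss) so that minimizing it pulls the two domains' embedding statistics together. No labels from the target domain are needed to compute it.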

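Keywords: domain adaptation; speaker recognition; labeled data; effective extraction; text-independent; alignment; discriminative information; discriminability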

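Classification: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]; TP18 [Electronics and Telecommunications: Information and Communication Engineering]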

 
