Speaker verification based on graph neural networks and multi-feature fusion  Cited by: 1


Authors: Cao Jialing, Chen Ning[1]

Affiliation: [1] School of Information Science & Engineering, East China University of Science & Technology, Shanghai 200237, China

Source: Application Research of Computers (《计算机应用研究》), 2023, No. 12, pp. 3678-3682, 3689 (6 pages)

Funding: National Natural Science Foundation of China (61771196).

Abstract: Recent research shows that features extracted by models pre-trained on large amounts of unlabeled speech perform strongly in speaker verification (SV) tasks. However, existing models cannot effectively optimize and aggregate frame-level features using the topological structure among them, and their high network complexity hinders real-time operation. Existing models also fail to fully exploit the complementarity among multiple input features to further improve performance. To address this, the paper first introduces a graph neural network that optimizes frame-level features using the topological structure among them. Second, it constructs a multi-loss-based multi-feature fusion mechanism that exploits the complementarity among different features to further improve performance. Experimental results on VoxCeleb show that the proposed model, GACNPF, achieves lower error rates and lower time complexity than existing models. More importantly, the model is flexible: it can fuse an arbitrary number of feature types and can be applied to other classification tasks based on pre-trained feature extraction.
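The two ideas in the abstract (refining frame-level features over a graph, then fusing several feature streams under multiple losses) can be illustrated with a toy sketch. This is not the GACNPF architecture, whose details are not given in the abstract. The k-nearest-neighbor graph built from cosine similarity, the softmax-weighted aggregation, the mean pooling, the concatenation-based fusion, and the plain sum of losses are all assumptions chosen for illustration. Every function name here is hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

def refine_frames(frames, k=2):
    """One round of attention-like aggregation over an assumed k-NN frame graph."""
    refined = []
    for i, f in enumerate(frames):
        # Assumed graph construction: connect each frame to its k most similar frames.
        neighbors = sorted(
            ((cosine(f, g), j) for j, g in enumerate(frames) if j != i),
            reverse=True,
        )[:k]
        nodes = [(1.0, i)] + neighbors  # include a self-loop with score 1.0
        # Softmax over similarity scores gives the aggregation weights.
        exps = [math.exp(score) for score, _ in nodes]
        z = sum(exps)
        new = [0.0] * len(f)
        for e, (_, j) in zip(exps, nodes):
            for d in range(len(f)):
                new[d] += (e / z) * frames[j][d]
        refined.append(new)
    return refined

def pool(frames):
    """Mean-pool frame-level vectors into one utterance-level vector."""
    n = len(frames)
    return [sum(f[d] for f in frames) / n for d in range(len(frames[0]))]

def embed(feature_streams, k=2):
    """Assumed fusion: concatenate the pooled embedding of each feature stream."""
    fused = []
    for frames in feature_streams:
        fused.extend(pool(refine_frames(frames, k)))
    return fused

def total_loss(per_feature_losses, fused_loss):
    """Assumed multi-loss objective: fused loss plus one loss per feature stream."""
    return fused_loss + sum(per_feature_losses)

# Two hypothetical feature streams for one utterance (3 frames each).
stream_a = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
stream_b = [[0.2, 0.4, 0.6], [0.1, 0.5, 0.7], [0.3, 0.3, 0.5]]
utterance = embed([stream_a, stream_b])
same_speaker_score = cosine(utterance, utterance)
```

Verification would then compare the fused embeddings of two utterances, for example by thresholding their cosine similarity. Because each stream is refined and pooled independently before concatenation, adding another feature type only means appending another stream to the list, which loosely mirrors the flexibility the abstract claims.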

Keywords: speaker verification; graph neural network; pre-training; feature fusion

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
