Semi-supervised Sound Event Detection Based on Meta Learning

Original title: 基于元学习的半监督声音事件检测方法

Authors: SHEN Yaxin (沈雅馨), GAO Lijian (高利剑), MAO Qirong (毛启容) [1,2]

Affiliations: [1] College of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, Jiangsu 212013, China; [2] Jiangsu Province Big Data Ubiquitous Perception and Intelligent Agriculture Application Engineering Research Center, Zhenjiang, Jiangsu 212013, China

Source: Computer Science (《计算机科学》), 2025, No. 3, pp. 222-230 (9 pages)

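Funding: National Natural Science Foundation of China (62176106); Postgraduate Research and Practice Innovation Program of Jiangsu Province (KYCX22_3668); Special Scientific Research Project of the School of Emergency Management, Jiangsu University (KY-A-01).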

Abstract: Existing semi-supervised sound event detection methods train directly on strongly labeled synthetic samples, weakly labeled real samples, and unlabeled real samples to alleviate the shortage of labeled data. However, there is an inevitable distribution gap between the synthetic and real data domains. This gap interferes with the direction of gradient optimization and thereby limits the generalization ability of the model. To address this problem, a novel semi-supervised sound event detection learning paradigm based on meta learning, Meta Mean Teacher (MMT), is proposed. Specifically, each training batch is divided into a meta-training set consisting of synthetic samples and a meta-test set consisting of real samples. The meta-gradient computed on the meta-training set serves as guidance for the gradient update on the meta-test set, allowing the model to perceive and learn more generalizable knowledge. Comparative experiments on the test set of the DCASE2021 Task 4 dataset show that, compared with the official baseline, MMT achieves a relative improvement of 8.9%, 6.6%, and 1.1% in the F1, PSDS1, and PSDS2 metrics, respectively. Compared with current state-of-the-art methods, MMT also demonstrates a significant performance advantage.

Keywords: sound event detection; meta learning; consistency regularization; semi-supervised learning; deep learning

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
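
The abstract describes the batch-level procedure only in words: compute a gradient on the synthetic (meta-training) part of the batch, then let it guide the gradient taken on the real (meta-test) part. The sketch below illustrates one common way to realize such a scheme, a first-order, MLDG-style update on a toy one-parameter regression model. It is an assumption for illustration, not the paper's MMT implementation: the model, the data, the function names (`mse_grad`, `meta_step`, `train`), and the hyperparameters `inner_lr`, `outer_lr`, and `beta` are all hypothetical, and the Mean Teacher consistency loss used for weakly labeled and unlabeled real samples is left out.

```python
# Toy sketch of a meta-learning style update (assumption: first-order, MLDG-like).
# Not the paper's MMT implementation; model, data, and hyperparameters are hypothetical.

def mse_grad(w, xs, ys):
    """Gradient of the mean squared error of y ~ w * x with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def mse_loss(w, xs, ys):
    """Mean squared error of y ~ w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def meta_step(w, synthetic, real, inner_lr=0.05, outer_lr=0.05, beta=1.0):
    """One update in which the meta-train gradient decides where the meta-test gradient is taken."""
    g_train = mse_grad(w, *synthetic)     # meta-train: synthetic samples
    w_virtual = w - inner_lr * g_train    # virtual step along the meta-train gradient
    g_test = mse_grad(w_virtual, *real)   # meta-test: real samples, at the virtual weights
    # First-order approximation: second-order terms through w_virtual are ignored.
    return w - outer_lr * (g_train + beta * g_test)


def train(synthetic, real, steps=200, w0=0.0):
    """Repeat the meta update for a fixed number of steps and return the final weight."""
    w = w0
    for _ in range(steps):
        w = meta_step(w, synthetic, real)
    return w


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0]
    synthetic = (xs, [2.0 * x for x in xs])  # "synthetic" domain: slope 2.0
    real = (xs, [2.2 * x for x in xs])       # "real" domain: slope 2.2 (domain gap)
    w = train(synthetic, real)
    print(f"w = {w:.3f}, real-domain loss = {mse_loss(w, *real):.4f}")
```

In this toy setting the two domains disagree on the best slope, and the learned weight settles between the two domain optima instead of fitting the synthetic data alone. In a real sound event detection system the scalar `w` would be the network parameters and the meta-test loss would be computed from weak labels and teacher-student consistency rather than from strong regression targets.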
