Authors: Zhang Xiaohui; Yi Jiangyan; Tao Jianhua; Zhou Junzuo (Institute of Automation, Chinese Academy of Sciences, Beijing 100190; School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044; Department of Automation, Tsinghua University, Beijing 100084)
Affiliations: [1] Institute of Automation, Chinese Academy of Sciences, Beijing 100190; [2] School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044; [3] Department of Automation, Tsinghua University, Beijing 100084
Source: Journal of Computer Research and Development, 2025, Issue 2, pp. 336-345 (10 pages)
Funding: National Key Research and Development Program of China (2020AAA0140003); National Natural Science Foundation of China (61831022, U21B2010, 61901473, 62006223, 2101553).
Abstract: Deep learning has achieved significant success in synthetic speech detection. However, deep models commonly attain high accuracy on test sets that closely match their training distribution, while their accuracy drops substantially in cross-dataset scenarios. To improve generalization to new datasets, a model is often fine-tuned on new data, but fine-tuning causes catastrophic forgetting: training on new data impairs the knowledge the model learned from old data, degrading its performance on that old data. Continual learning is one of the main approaches to mitigating catastrophic forgetting. This paper proposes a continual learning algorithm for synthetic speech detection, elastic orthogonal weight modification (EOWM), to overcome catastrophic forgetting. EOWM reduces the damage to previously learned knowledge by correcting both the direction and the magnitude of parameter updates while the model learns new knowledge. Specifically, it constrains parameter updates to be orthogonal to the data distribution of old tasks, and simultaneously limits the update magnitude of parameters that are important to old tasks. The algorithm achieves promising results in cross-dataset experiments on synthetic speech detection. Compared with fine-tuning, EOWM reduces the equal error rate (EER) on the old dataset from 7.334% to 0.821%, a relative improvement of 90%, and reduces the EER on the new dataset from 0.513% to 0.315%, a relative improvement of 40%.
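The abstract describes two mechanisms: projecting gradient updates orthogonal to the old tasks' input distribution (as in orthogonal weight modification), and elastically damping updates to parameters that are important for old tasks (as in elastic weight consolidation). The minimal NumPy sketch below illustrates how these two mechanisms can compose for a single linear layer; all class and parameter names (`alpha`, `lam`, `importance`) are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

class EOWMSketch:
    """Illustrative sketch of an elastic orthogonal weight modification update.

    Combines the two ideas named in the abstract:
      1) direction: project gradients orthogonal to old-task inputs (OWM-style
         recursive projector update);
      2) magnitude: shrink updates on parameters important to old tasks
         (EWC-style per-parameter importance).
    Hyperparameters and names here are assumptions for illustration only.
    """

    def __init__(self, dim, alpha=1e-3, lam=10.0):
        self.P = np.eye(dim)              # projector onto space orthogonal to old inputs
        self.importance = np.zeros(dim)   # per-parameter importance (e.g. Fisher-like)
        self.alpha = alpha                # projector regularizer
        self.lam = lam                    # elastic damping strength

    def observe_old_input(self, x):
        """Fold one old-task input into the projector (recursive least squares)."""
        x = x.reshape(-1, 1)
        Px = self.P @ x
        self.P -= (Px @ Px.T) / (self.alpha + float(x.T @ Px))

    def modify_gradient(self, grad):
        """Return a gradient corrected in both direction and magnitude."""
        g = self.P @ grad                          # orthogonal to old-task inputs
        return g / (1.0 + self.lam * self.importance)  # damp important parameters
```

After `observe_old_input` has seen an input direction, any gradient component along that direction is (nearly) zeroed out by `modify_gradient`, so new-task training barely perturbs responses to old-task inputs.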
Keywords: synthetic speech detection; continual learning; elastic orthogonal weight modification; pre-trained model; deep neural network
Classification: TP391 [Automation and Computer Technology - Computer Application Technology]