Authors: LI Lei (李磊)[1,2]; ZHU Yongtong (朱永同); YANG Qi (杨琦)[1,2,3]; ZHAO Jinwei (赵金葳); MA Ke (马柯)
Affiliations: [1] School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China; [2] Institute of Machine Intelligence, University of Shanghai for Science and Technology, Shanghai 200093, China; [3] School of Mechanical Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China; [4] School of Mechanical and Electrical Information, Shangqiu University, Shangqiu, Henan 476000, China
Source: Intelligent Computer and Applications (《智能计算机与应用》), 2024, No. 1, pp. 85-94, 101 (11 pages in total)
Abstract: Traditional audio classification extracts feature vectors from single-level audio only. Even with a very large model, the excess parameters cause coupling between features, which violates the principle of high cohesion and low coupling in feature extraction. Observing that some emotion-related covariates are not fully exploited, this paper incorporates gender prior knowledge into the model; recasts multilevel audio feature classification as a multi-task problem, so that the multilevel features are decoupled before classification; and designs a center loss module to further optimize the feature distribution. Experiments on the IEMOCAP dataset show that the proposed model achieves a weighted accuracy (WA) of 71.94% and an unweighted accuracy (UA) of 73.37%, improvements of 1.38% and 2.35%, respectively, over the original multilevel model. In addition, two single-level audio feature extractors are designed based on the Nlinear and Dlinear algorithms, and they achieve good results in single-level audio feature classification experiments.
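The abstract reports WA and UA without defining them. The sketch below assumes the convention common in speech emotion recognition: WA is overall accuracy (correct predictions over all samples), and UA is the mean of per-class recall, so that every class counts equally regardless of size. The paper itself may define them differently. The function names and the toy labels are hypothetical and are not taken from the paper or from IEMOCAP.

```python
from collections import defaultdict


def weighted_accuracy(y_true, y_pred):
    """Overall accuracy: correct predictions divided by all samples."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def unweighted_accuracy(y_true, y_pred):
    """Mean of per-class recall: each class contributes equally."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    totals = defaultdict(int)  # samples per true class
    hits = defaultdict(int)    # correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    recalls = [hits[c] / totals[c] for c in totals]
    return sum(recalls) / len(recalls)


# Hypothetical toy labels, for illustration only.
y_true = ["ang", "ang", "ang", "ang", "hap", "hap", "neu", "sad"]
y_pred = ["ang", "ang", "ang", "hap", "hap", "neu", "neu", "sad"]

print(weighted_accuracy(y_true, y_pred))    # 6 of 8 correct
print(unweighted_accuracy(y_true, y_pred))  # mean of recalls 0.75, 0.5, 1.0, 1.0
```

On class-imbalanced data the two metrics can diverge noticeably, which is presumably why results like those in the abstract are reported as a pair.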
Keywords: speech emotion classification; MFCC; center loss; multi-task learning; prior information; Dlinear
Classification No.: TP241 [Automation and Computer Technology: Detection Technology and Automation Devices]