Authors: YANG Zhuang; YAN Yonghong [2]; HUANG Zhihua [1]
Affiliations: [1] Laboratory of Signal Detection and Processing, School of Computer Science and Technology, Xinjiang University, Urumqi 830000, China; [2] Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190, China
Source: Journal of Applied Acoustics (《应用声学》), 2024, No. 3, pp. 498-504 (7 pages)
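Funding: General Program of the Natural Science Foundation of Xinjiang Uygur Autonomous Region (2022D01C59); Key Research and Development Program of the Ministry of Science and Technology (2018YFC0823402).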
Abstract: Accent detection is the task of identifying different regional accents within a single language. This work applies several methods to improve accent detection accuracy. First, to address the problem that key features are not weighted prominently enough within the acoustic features, attention mechanisms are introduced, and several of them are compared and analyzed. The model adaptively learns separate weights along the channel and spatial dimensions, which improves detection performance. Experiments on the Common Voice English accent dataset show that adding the CBAM attention module is effective: accuracy improves by a relative 12.7%, precision by a relative 17.9%, and F1 score by a relative 6.98%. Second, a tree-form classification method is proposed to mitigate the long-tail effect in the dataset, giving a relative accuracy improvement of up to 5.2%. Finally, inspired by domain adversarial training (DAT), adversarial learning is used to remove redundant information from the accent features, giving relative improvements of up to 3.4% in accuracy and up to 16.9% in recall.
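All of the gains in the abstract are reported as relative improvements, which are easy to confuse with absolute percentage-point gains. The sketch below assumes the usual definition, (improved - baseline) / baseline, which the abstract does not spell out. The function name and the baseline and improved values are hypothetical and are not taken from the paper.

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Return the change of `improved` over `baseline`, in percent of the baseline.

    Assumed definition: (improved - baseline) / baseline * 100.
    """
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (improved - baseline) / baseline * 100.0


# Hypothetical accuracies for illustration only; NOT results from the paper.
baseline_acc = 0.60
improved_acc = 0.66

absolute_gain_points = (improved_acc - baseline_acc) * 100.0
relative_gain_percent = relative_improvement(baseline_acc, improved_acc)

print(f"absolute gain: {absolute_gain_points:.1f} percentage points")
print(f"relative gain: {relative_gain_percent:.1f}%")
```

With these made-up numbers, a gain of 6 percentage points corresponds to a 10% relative improvement, so a relative figure says little about the absolute accuracy unless the baseline is also known.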
Classification code: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]