Authors: MENG Yifan; CHEN Ning [1]; LI Hongkai
Affiliation: [1] School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
Source: Journal of East China University of Science and Technology (Natural Science Edition), 2024, No. 6, pp. 898-904 (7 pages)
Funding: National Natural Science Foundation of China, General Program (61771196).
Abstract: Compared with dialects of other languages, Chinese dialects are numerous, with small inter-class differences and large intra-class differences, which makes Chinese dialect identification highly challenging. The differences between Chinese dialects may appear in the local (short-term) characteristics of speech, in its global (long-term) characteristics, and at different hierarchical levels of the speech representation. This paper therefore proposes a Chinese dialect identification model that combines local and global speech feature extraction with multi-level feature aggregation. The model first extracts local features of speech with Res2Block, then extracts global features with Conformer, and finally aggregates multi-level features by cascading the outputs of multiple Conformers. Experimental results in both cross-domain and in-domain settings show that the proposed model achieves higher identification accuracy than the baseline model.
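Keywords: Conformer; dialect identification; multi-level feature aggregation; Res2Block; attentive statistics pooling
Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]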
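
The abstract and keywords name two steps that can be illustrated generically: aggregating frame-level features from several layers, and attentive statistics pooling of those frames into one utterance-level vector. The sketch below is a minimal pure-Python illustration under stated assumptions, not the authors' implementation. It assumes that aggregation means concatenating the outputs of several layers along the feature axis, and it replaces the small attention network usually used in attentive statistics pooling with a single hypothetical linear scorer (`score_weights`). The function names and toy inputs are illustrative only; the record does not give the model's actual layer counts or dimensions.

```python
import math


def aggregate_levels(level_outputs):
    """Concatenate per-frame features from several layers along the feature axis.

    level_outputs: list of levels; each level is a list of T frames,
    and each frame is a list of floats. (Assumed aggregation scheme.)
    """
    num_frames = len(level_outputs[0])
    if any(len(level) != num_frames for level in level_outputs):
        raise ValueError("all levels must have the same number of frames")
    return [
        [value for level in level_outputs for value in level[t]]
        for t in range(num_frames)
    ]


def attentive_statistics_pooling(frames, score_weights):
    """Pool T frames into one vector: attention-weighted mean and std, concatenated.

    score_weights is a hypothetical linear scorer (one weight per feature),
    standing in for the learned attention network.
    """
    # One scalar attention score per frame.
    scores = [sum(w * x for w, x in zip(score_weights, frame)) for frame in frames]
    # Numerically stable softmax over the time axis.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]

    dim = len(frames[0])
    mean = [sum(a * f[d] for a, f in zip(alphas, frames)) for d in range(dim)]
    second_moment = [
        sum(a * f[d] ** 2 for a, f in zip(alphas, frames)) for d in range(dim)
    ]
    # Clamp at zero to guard against tiny negative values from rounding.
    std = [math.sqrt(max(m2 - m ** 2, 0.0)) for m2, m in zip(second_moment, mean)]
    return mean + std


if __name__ == "__main__":
    # Two toy "levels", each with 2 frames of 1 feature.
    levels = [[[1.0], [3.0]], [[2.0], [4.0]]]
    aggregated = aggregate_levels(levels)
    pooled = attentive_statistics_pooling(aggregated, [0.0, 0.0])
    print(aggregated, pooled)
```

With all-zero scorer weights the attention is uniform, so the pooled vector reduces to the plain per-dimension mean and standard deviation of the aggregated frames; a learned scorer would instead emphasize the frames most useful for distinguishing dialects.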