Authors: CHEN Si-zhu (陈思竹), LONG Hua (龙华) [1], SHAO Yu-bin (邵玉斌) [1]
Affiliations: [1] Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, Yunnan, China; [2] Radio Monitoring Center of Yunnan Province, Kunming 650228, Yunnan, China
Source: Journal of China Academy of Electronics and Information Technology (《中国电子科学研究院学报》), 2023, No. 12, pp. 1138-1145 (8 pages)
Funding: Open Fund Project of the Yunnan Provincial Key Laboratory of Media Convergence (320225403).
Abstract: Deep learning methods have been widely studied and applied in image recognition and are increasingly being applied to language recognition. To address the problem that the two-dimensional feature maps used by deep learning language recognition models are highly similar across languages and therefore easily confused, a ResNeSt language recognition model based on counterfactual attention learning is proposed. Using a purpose-built dataset of broadcast speech in the languages of the Yunnan border region, MFCC, Fbank, and spectrogram features are first extracted as inputs to three networks: FcaNet, ResNet, and ResNeSt. Comparing the recognition performance of each speech feature at different signal-to-noise ratios across the three networks shows that ResNeSt is the best-performing network and Fbank the best-performing speech feature overall for language recognition. A counterfactual attention learning module is then introduced into the ResNeSt network. It uses counterfactual causality to measure the quality of the attention features in ResNeSt, which encourages the network to learn more effective attention features and thereby improves training. Experimental results show that, with counterfactual attention learning, the recognition rate for Fbank features improves by 1.61% over the baseline system, and that across MFCC, Fbank, and spectrogram features the ResNeSt network with counterfactual attention learning improves on the baseline ResNeSt network by 1.33% on average. Counterfactual attention learning helps the attention mechanism focus on more of the information that discriminates between languages, effectively improving recognition performance in language recognition tasks.
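The abstract does not describe how the counterfactual attention module is implemented. As a rough illustration only, the sketch below assumes the commonly described formulation of counterfactual attention learning: the prediction made with the learned attention is compared with the prediction made after replacing that attention with a random one, and the difference (the "effect") is what the training signal rewards. The function names, the linear classifier, and the toy data are all hypothetical and are not taken from the paper.

```python
import math
import random


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attend_and_classify(features, attention, weights):
    """Pool frame features with an attention map, then apply a linear classifier.

    features:  T frames, each a list of D floats (e.g. Fbank-like vectors)
    attention: T non-negative weights, one per frame
    weights:   C rows of D floats (hypothetical classifier, one row per language)
    """
    dim = len(features[0])
    pooled = [sum(a * frame[d] for a, frame in zip(attention, features))
              for d in range(dim)]
    return [sum(w * p for w, p in zip(row, pooled)) for row in weights]


def counterfactual_effect(features, attention, weights, rng):
    """Return factual logits minus logits under a random (counterfactual) attention."""
    factual = attend_and_classify(features, attention, weights)
    # Assumed intervention: swap the learned attention for a random attention map.
    random_attention = softmax([rng.gauss(0.0, 1.0) for _ in attention])
    counterfactual = attend_and_classify(features, random_attention, weights)
    return [f - c for f, c in zip(factual, counterfactual)]


def cross_entropy(logits, label):
    """Cross-entropy of softmax(logits) against an integer class label."""
    return -math.log(softmax(logits)[label])


# Toy example: two frames, two feature dimensions, two languages.
features = [[1.0, 0.0], [0.0, 1.0]]
learned_attention = [1.0, 0.0]          # attends only to the first frame
weights = [[1.0, 0.0], [0.0, 1.0]]      # identity classifier for illustration
effect = counterfactual_effect(features, learned_attention, weights,
                               random.Random(0))
# Assumed training objective: supervise both the factual prediction and the effect.
factual_logits = attend_and_classify(features, learned_attention, weights)
loss = cross_entropy(factual_logits, 0) + cross_entropy(effect, 0)
```

Under this assumption, an attention map that is no better than a random one yields an effect close to zero, so supervising the effect pushes the network toward attention that genuinely changes the prediction.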
Keywords: language recognition; counterfactual attention learning; ResNeSt; speech features
Classification code: TN912.34 [Electronics and Telecommunications: Communication and Information Systems]