Authors: ELHAM Dlzat (迪力扎提·伊力哈木), ABLIMIT Mijit (米吉提·阿不里米提)[1], ZHENG Fang (郑方), HAMDULLA Askar (艾斯卡尔·艾木都拉)[1]
Affiliations: [1] College of Information Science and Engineering, Xinjiang University, Urumqi 830046, Xinjiang, China; [2] School of Information Science and Technology, Tsinghua University, Beijing 100084, China
Source: Modern Electronics Technique (《现代电子技术》), 2022, No. 24, pp. 37-43 (7 pages)
Funding: National Key Research and Development Program of China (2017YFC0820602).
Abstract: Existing language recognition methods pay little attention to cross-channel conditions, yet in real applications differences in speech acquisition devices and transmission channels cause recognition performance to drop sharply. To reduce the impact of channel mismatch, the paper proposes an attention-based BiLSTM language recognition method and, at the feature extraction stage, compares the recognition performance of different speech features, including MFCC, FBANK, and LPCC. Experiments show that FBANK features perform better in cross-channel conditions, and that the attention mechanism lets the model focus on language-related information in cross-channel speech segments while ignoring information unrelated to language. The method is evaluated on two cross-channel datasets from the Oriental Language Recognition competition (AP19-OLR and AP20-OLR). Compared with language recognition methods such as the baseline X-VECTOR system, it lowers the equal error rate (EER) by 3.48% and 5.66% on the two datasets, respectively. The results indicate that the attention-based BiLSTM method improves feature extraction for language recognition and improves recognition performance in cross-channel conditions.
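Keywords: language recognition; cross-channel; feature extraction; attention mechanism; comparison of recognition methods; BiLSTM model
Classification No.: TN911.23-34 [Electronics and Telecommunications: Communication and Information Systems]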
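Note on the metric: the abstract reports results as equal error rate (EER), the operating point at which the false acceptance rate equals the false rejection rate. The following is a minimal sketch of how EER can be estimated from trial scores. It is not the authors' evaluation code; the function name `equal_error_rate` and the threshold sweep over observed scores are illustrative assumptions.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Estimate EER by sweeping a threshold over all observed scores.

    Hypothetical helper for illustration only. A trial is accepted when
    its score is greater than or equal to the threshold.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = None, None
    for t in thresholds:
        # False rejection rate: target trials scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance rate: non-target trials at or above the threshold.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


if __name__ == "__main__":
    # Made-up scores, not data from the paper.
    print(equal_error_rate([0.9, 0.8, 0.4], [0.1, 0.2, 0.6]))
```

With a finite set of trials the two error rates rarely coincide exactly, so this sketch returns their average at the closest threshold; evaluation toolkits often interpolate instead.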