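Authors: WANG Shigang (王世刚); YAN Jin (严瑾) (School of Automation, Guangxi University of Science and Technology, Liuzhou 545616, China)
Affiliation: [1] School of Automation, Guangxi University of Science and Technology, Liuzhou, Guangxi 545616
Source: Audio Engineering (《电声技术》), 2023, No. 12, pp. 111-114 (4 pages)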
Abstract: The Deep Feedforward Sequential Memory Network (DFSMN) is an acoustic model with high recognition accuracy that has been applied successfully to speaker-independent speech recognition, but it suffers from parameter redundancy and is difficult to train. To address this, the paper proposes a speaker-independent speech recognition model based on an improved DFSMN. The model changes the size of the DFSMN memory modules and the way the modules are connected, and combines the network with the Connectionist Temporal Classification (CTC) end-to-end speech recognition framework. Experimental results show that, under the same conditions, the improved model has about 1/10 fewer parameters than the original. Compared with several common speech recognition models on different datasets, it achieves the lowest character error rate in every case, and it shows certain advantages in both recognition accuracy and training efficiency.
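Keywords: speech recognition; Deep Feedforward Sequential Memory Network (DFSMN); speaker-independent; Connectionist Temporal Classification (CTC)
Classification No.: TN912.34 [Electronics and Telecommunications: Communication and Information Systems]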
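
The abstract reports character error rate as its evaluation metric but does not say how it is computed. The following is a minimal sketch, assuming the usual definition: the Levenshtein edit distance between the reference and recognized character sequences, divided by the reference length. The function names are illustrative and are not taken from the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences
    (unit cost for substitution, insertion, and deletion)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from the reference
                curr[j - 1] + 1,     # insert h into the reference
                prev[j - 1] + cost,  # substitute r with h, or match
            ))
        prev = curr
    return prev[-1]


def character_error_rate(ref, hyp):
    """Assumed definition: edit distance divided by reference length."""
    if len(ref) == 0:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    reference = "语音识别模型"
    hypothesis = "语音识模形"  # one character dropped, one substituted
    print(character_error_rate(reference, hypothesis))
```

Under this definition the rate can exceed 1.0 when the hypothesis is much longer than the reference, so results are normally averaged over a whole test set rather than read per utterance.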