Authors: 顾龙昊 (GU Longhao); 黄连丽 (HUANG Lianli)[1]; 周奎 (ZHOU Kui); 张子越 (ZHANG Ziyue)
Affiliations: [1] Department of Electrical and Information Engineering, Hubei University of Automotive Technology; [2] Sharing-X Key Joint Laboratory, Institute of Automotive Engineers, Hubei University of Automotive Technology, Shiyan 442002, Hubei, China
Source: Software Guide (软件导刊), 2024, No. 9, pp. 76-81 (6 pages)
Funding: Hubei Provincial Key Research and Development Program (2023BAB169); Science and Technology Major Project of Wuhan, Hubei Province (2022013702025184)
Abstract: To address the reduced recognition accuracy and poor generalization caused by insufficient training data under low-resource conditions, a speech recognition method is proposed. The method uses convolution and pooling to extract feature information, and fuses an attention mechanism with the DTDNN (a time-delay neural network) to form ADTDNN, improving the model's ability to capture key information in a sequence under low-resource conditions. Connectionist temporal classification (CTC) is adopted to simplify the model's recognition pipeline, and a Transformer is used as the language model. Experiments on the Aishell-1 dataset show that, under low-resource conditions, the ADTDNN-based speech recognition model reduces the character error rate by 3.7% and 1.0% compared with mainstream end-to-end models such as LAS and Transformer, respectively.
Keywords: speech recognition; time-delay neural network; Transformer; data augmentation; low-resource
CLC Number: TP391.4 [Automation and Computer Technology - Computer Application Technology]
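The abstract describes the model only at a high level (a convolution-and-pooling front end, an attention mechanism fused with a TDNN-style network, CTC training, and a Transformer language model). The following is a minimal, hypothetical PyTorch sketch of that kind of attention-augmented TDNN acoustic model trained with CTC; the layer sizes, context widths, vocabulary size, and overall topology are illustrative assumptions, not the authors' ADTDNN.

```python
# Minimal sketch (not the paper's implementation) of an attention-augmented
# TDNN acoustic model trained with CTC. All dimensions are illustrative.
import torch
import torch.nn as nn

class TDNNBlock(nn.Module):
    """A TDNN layer realised as a dilated 1-D convolution over time."""
    def __init__(self, in_dim, out_dim, context=3, dilation=1):
        super().__init__()
        self.conv = nn.Conv1d(in_dim, out_dim, kernel_size=context,
                              dilation=dilation,
                              padding=dilation * (context // 2))
        self.act = nn.ReLU()
        self.norm = nn.BatchNorm1d(out_dim)

    def forward(self, x):            # x: (batch, channels, time)
        return self.norm(self.act(self.conv(x)))

class AttentionTDNN(nn.Module):
    """TDNN stack followed by self-attention and a CTC output layer."""
    def __init__(self, feat_dim=80, hidden=256, vocab_size=4233):
        super().__init__()
        self.frontend = nn.Sequential(           # convolutional front end
            TDNNBlock(feat_dim, hidden, context=5, dilation=1),
            TDNNBlock(hidden, hidden, context=3, dilation=2),
            TDNNBlock(hidden, hidden, context=3, dilation=3),
        )
        self.attn = nn.MultiheadAttention(hidden, num_heads=4, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)  # vocab includes the CTC blank

    def forward(self, feats):                     # feats: (batch, time, feat_dim)
        h = self.frontend(feats.transpose(1, 2)).transpose(1, 2)
        h, _ = self.attn(h, h, h)                 # attention over TDNN outputs
        return self.out(h).log_softmax(dim=-1)    # (batch, time, vocab) log-probs

# CTC training step on random data, just to show the wiring.
model = AttentionTDNN()
feats = torch.randn(2, 100, 80)                   # 2 utterances, 100 frames of 80-dim features
targets = torch.randint(1, 4233, (2, 20))         # dummy label sequences (0 = blank)
log_probs = model(feats).transpose(0, 1)          # CTCLoss expects (time, batch, vocab)
loss = nn.CTCLoss(blank=0)(log_probs, targets,
                           input_lengths=torch.full((2,), 100, dtype=torch.long),
                           target_lengths=torch.full((2,), 20, dtype=torch.long))
loss.backward()
```

The data augmentation and the Transformer language model mentioned in the abstract would sit outside this acoustic model (in the training pipeline and the decoder, respectively) and are omitted from the sketch.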