ACGFN: A Speech Recognition Model Based on Asymmetric Convolution and Gated Feedforward Neural Network

Authors: WANG Yongsen; LIU Qian; LIU Libo (College of Information Engineering, Ningxia University, Yinchuan, Ningxia 750021, China)

Affiliation: [1] College of Information Engineering, Ningxia University, Yinchuan, Ningxia 750021, China

Source: Journal of Chinese Information Processing (《中文信息学报》), 2025, No. 1, pp. 167-174 (8 pages)

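Funding: Key Research and Development Program of Ningxia Hui Autonomous Region (2022BEG03073); National Natural Science Foundation of China (62262053); Ningxia Science and Technology Innovation Leading Talent Project (2022GKLRLX03).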

Abstract: Existing Conformer-based speech recognition models extract time-frequency features poorly, have a redundant structure, and carry a large number of parameters. To address these problems, this paper proposes ACGFN, a speech recognition model based on asymmetric convolution and a gated feedforward neural network. First, the model applies asymmetric convolutions with different receptive field sizes to perform multi-scale fusion and downsampling of the time-frequency features of the speech sequence, which strengthens time-frequency feature extraction while reducing the information lost during downsampling. Second, a gated feedforward module replaces the double half-step feedforward network in Conformer, which reduces the number of network parameters and simplifies the model structure. Experimental results show that the proposed method achieves character error rates (CER) of 4.48% and 4.28% on the test sets of the public datasets AISHELL-1 and aidatatang_200zh, respectively, with only 40.3M parameters. Compared with the baseline methods, both the CER and the parameter count are lower.
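The parameter argument in the abstract can be made concrete with a rough weight count. The sketch below is only an illustration under stated assumptions: the abstract does not give kernel sizes, channel widths, or the internal layout of the gated feedforward module, so the 1 x k plus k x 1 pairing, the GLU-style gate with three projections, and every numeric size are hypothetical choices, not the paper's configuration. Biases are ignored.

```python
# Back-of-the-envelope weight counts (biases ignored). All sizes below are
# hypothetical illustration values, not the configuration used in the paper.


def conv2d_weights(kernel_h, kernel_w, in_ch, out_ch):
    """Weights in one 2-D convolution with a kernel_h x kernel_w kernel."""
    return kernel_h * kernel_w * in_ch * out_ch


def asymmetric_pair_weights(k, in_ch, out_ch):
    """A 1 x k convolution followed by a k x 1 convolution (assumed layout)."""
    return conv2d_weights(1, k, in_ch, out_ch) + conv2d_weights(k, 1, out_ch, out_ch)


def feedforward_weights(d_model, d_hidden):
    """Plain two-layer feedforward block: d_model -> d_hidden -> d_model."""
    return 2 * d_model * d_hidden


def gated_feedforward_weights(d_model, d_hidden):
    """GLU-style block (assumed): value, gate, and output projections."""
    return 3 * d_model * d_hidden


if __name__ == "__main__":
    channels, k = 64, 5  # hypothetical channel width and kernel size
    print("square k x k conv:    ", conv2d_weights(k, k, channels, channels))
    print("1 x k + k x 1 pair:   ", asymmetric_pair_weights(k, channels, channels))

    d_model, d_hidden = 256, 1024  # hypothetical model and hidden widths
    print("two feedforward blocks:", 2 * feedforward_weights(d_model, d_hidden))
    print("one gated block:       ", gated_feedforward_weights(d_model, d_hidden))
```

Under these assumed sizes, a pair of asymmetric kernels needs 2k weights per channel pair instead of k squared, and a single gated block with three projections needs fewer weights than two separate two-layer feedforward blocks of the same width. Whether the paper's modules follow this exact layout is not stated in the abstract.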

Keywords: speech recognition; end-to-end; Conformer

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
