Authors: HUANG Tianyuan (黄天圆); WANG Chao (王超)
Affiliation: [1] School of Information and Electrical Engineering, Hebei University of Engineering, Handan 056038, Hebei, China
Source: Intelligent Computer and Applications (《智能计算机与应用》), 2025, No. 2, pp. 162-167 (6 pages)
Funding: General Program of the Natural Science Foundation of Hebei Province (A2020402013).
Abstract: Conformer is one of the most widely used models for language processing tasks. It combines features of the Transformer and convolutional neural networks, so it captures both local and global sequence features and better models the structure and context of the input. However, the alignment between audio and text in the existing Conformer model is uncertain, and its multi-head attention leaks input information from future time steps into the current time step. To address these problems, connectionist temporal classification (CTC) is used as an auxiliary training objective, which improves the robustness of the Macaron-Net-based Conformer model and resolves the misalignment between audio and text. In the decoder, masked multi-head self-attention ensures that at time step t the model cannot see input from future time steps, so that it predicts using only the tokens already generated. Experimental results show that the CTC-Conformer model with masked multi-head attention achieves a lower character error rate and a lower loss than the Conformer model, with the loss reaching a minimum of 3.24.
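The masking idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single attention head, plain NumPy, and hypothetical function names (`causal_mask`, `masked_self_attention`), and it only shows how a lower-triangular mask keeps position t from attending to later positions.

```python
import numpy as np

def causal_mask(T):
    # True where query position i may attend to key position j (j <= i).
    return np.tril(np.ones((T, T), dtype=bool))

def masked_self_attention(q, k, v):
    # q, k, v: arrays of shape (T, d) for a single head (an assumption
    # made for brevity; a multi-head layer applies this per head).
    T, d = q.shape
    scores = q @ k.T / np.sqrt(d)
    # Future positions get -inf so their softmax weight is exactly zero.
    scores = np.where(causal_mask(T), scores, -np.inf)
    # Numerically stable softmax over the key axis.
    scores -= scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((4, 8)) for _ in range(3))
out, weights = masked_self_attention(q, k, v)
```

With this mask, changing the value vector at the last position leaves the outputs at all earlier positions unchanged, which is the property the abstract refers to as preventing leakage of future information.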
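Keywords: Conformer; CTC; masked multi-head attention; language processing
Classification: TP391.1 [Automation and Computer Technology: Computer Application Technology]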