Author: 段文婷 DUAN Wenting[1]
Affiliation: [1] Shangluo University, Shangluo, Shaanxi 726000, China
Source: Automation & Instrumentation (《自动化与仪器仪表》), 2022, No. 11, pp. 210-215.
Funding: Shaanxi Social Sciences 2021 Major Theoretical and Practical Issues Research Project (2021ND0104).
Abstract: Traditional pronunciation-detection models for English dialogue robots have low pronunciation error-detection accuracy, which degrades the detection of pronunciation standardness. To address this, a multimodal end-to-end model, BiLSTM-CTC, based on lip-angle fusion is proposed. After the robot's raw dialogue data are obtained, the audio and video streams are preprocessed separately; the extracted audio-visual features are normalized and augmented; a BiLSTM network then performs feature learning, with Softmax producing per-frame sequence probabilities; finally, a CTC algorithm serves as the output layer and generates the predicted output sequence. Experimental results show that, in noise-free and SNR = 10 dB conditions, the multimodal speech recognition method based on angle-feature fusion converges after 86 and 125 iterations, with speech recognition rates of 98.73% and 91.15%, respectively. For rounded/spread lip sounds and for overall pronunciation standardness, the method's error-detection accuracies are 95.66% and 94.86% (noise-free) and 92.34% and 91.38% (SNR = 10 dB), all better than the two comparison models. The proposed model therefore converges faster, achieves higher pronunciation recognition and error-detection rates on audio signals, and can realize pronunciation-standardness detection for English dialogue robots.
Keywords: pronunciation error detection; BiLSTM-CTC; multimodal; feature fusion; speech recognition
CLC number: TP392 [Automation and Computer Technology - Computer Application Technology]
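The abstract's pipeline ends with a CTC output layer that turns the BiLSTM/Softmax per-frame probabilities into a predicted label sequence. A minimal sketch of standard greedy CTC decoding (argmax path, collapse repeated labels, drop blanks); the function name, blank index, and the toy probability matrix are illustrative assumptions, not the paper's code:

```python
BLANK = 0  # conventional CTC blank index (an assumption, not from the paper)

def ctc_greedy_decode(frame_probs):
    """Greedy CTC decoding: take the argmax label per frame,
    collapse consecutive repeats, then remove blank labels."""
    # Best label at each time step (the greedy path).
    path = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]
    decoded, prev = [], None
    for label in path:
        # Keep a label only if it differs from the previous frame's
        # label (collapse repeats) and is not the blank symbol.
        if label != prev and label != BLANK:
            decoded.append(label)
        prev = label
    return decoded

# Toy example: 6 frames over 3 classes (0 = blank, 1, 2).
probs = [
    [0.10, 0.80, 0.10],  # argmax 1
    [0.10, 0.80, 0.10],  # argmax 1 (repeat, collapsed)
    [0.90, 0.05, 0.05],  # blank
    [0.10, 0.10, 0.80],  # argmax 2
    [0.10, 0.10, 0.80],  # argmax 2 (repeat, collapsed)
    [0.90, 0.05, 0.05],  # blank
]
print(ctc_greedy_decode(probs))  # → [1, 2]
```

In the paper's setting, each label would index a phoneme class, and the decoded sequence is compared against the reference pronunciation to flag errors.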