Research of phoneme recognition based on the CNN-BGRU model

Cited by: 1

Authors: HE Li-hua; YANG Hao-ran; JIANG Tao [1]; PAN Wen-lin [1] (School of Mathematics and Computer Science, Yunnan Minzu University, Kunming 650500, China)

Affiliation: [1] School of Mathematics and Computer Science, Yunnan Minzu University, Kunming 650500, Yunnan, China

Source: Journal of Yunnan Minzu University: Natural Sciences Edition, 2020, No. 5, pp. 493-500 (8 pages)

Funding: National Natural Science Foundation of China (61363022).

Abstract: A phoneme is the smallest unit of speech in a language system, and in large-vocabulary speech recognition tasks phoneme recognition is not restricted by vocabulary or sentences. The phoneme is therefore chosen as the recognition unit, and a neural network model based on CNN-BGRU is built to classify phoneme spectrograms. First, the short-time Fourier transform is used to generate phoneme spectrograms as the model input. Second, the CNN-BGRU model is built: an improved VGGNet model extracts features from the phoneme spectrograms, and a bidirectional gated recurrent unit (BGRU) then represents their sequence information. Finally, a Softmax classifier classifies the phoneme spectrograms. Experiments on the TIMIT English speech dataset reach an accuracy of 98.6% for phoneme spectrogram recognition, better than the four models CNN (VGG16), CNN-RNN, CNN-BRNN, and CNN-BLSTM.
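The stages named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the improved VGGNet feature extractor is omitted entirely, the weights are random and untrained, the input is a synthetic sine wave rather than TIMIT audio, and every name and size (`stft_magnitude`, `GRUCell`, `frame_len=16`, four hidden units, three output classes) is an assumption chosen for illustration. The GRU update shown is one common convention, written without bias terms.

```python
import cmath
import math
import random


def stft_magnitude(signal, frame_len=16, hop=8):
    """Magnitude spectrogram: one list of frame_len // 2 + 1 bins per frame."""
    # Hann window tapers each frame to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_len)]
        bins = []
        for k in range(frame_len // 2 + 1):
            # Naive DFT for clarity; a real system would call an FFT routine.
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            bins.append(abs(acc))
        frames.append(bins)
    return frames


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """Gated recurrent unit without biases (illustrative, untrained)."""

    def __init__(self, rng, n_in, n_hidden):
        # One input matrix W and one recurrent matrix U per gate:
        # update gate z, reset gate r, candidate state h.
        self.W = {g: rand_matrix(rng, n_hidden, n_in) for g in "zrh"}
        self.U = {g: rand_matrix(rng, n_hidden, n_hidden) for g in "zrh"}
        self.n_hidden = n_hidden

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in
             zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in
             zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        reset_h = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a + b) for a, b in
                zip(matvec(self.W["h"], x), matvec(self.U["h"], reset_h))]
        # Blend the previous state with the candidate state.
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

    def run(self, frames):
        h = [0.0] * self.n_hidden
        for x in frames:
            h = self.step(x, h)
        return h


def softmax(logits):
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(frames, fwd, bwd, out_weights):
    # Bidirectional: final state of the forward pass concatenated with the
    # final state of a second GRU run over the time-reversed frames.
    features = fwd.run(frames) + bwd.run(frames[::-1])
    return softmax(matvec(out_weights, features))


rng = random.Random(0)
# Synthetic input: a sine with a period of 16 samples, 64 samples long.
signal = [math.sin(2 * math.pi * 4 * n / 64) for n in range(64)]
spec = stft_magnitude(signal)
fwd = GRUCell(rng, n_in=9, n_hidden=4)
bwd = GRUCell(rng, n_in=9, n_hidden=4)
out_weights = rand_matrix(rng, 3, 8)  # 3 hypothetical phoneme classes
probs = classify(spec, fwd, bwd, out_weights)
```

Here `spec` holds one row of frequency bins per time frame, which is the sequence the two GRU passes consume, and `probs` is a probability vector over the hypothetical classes. In the model described above, the recurrent layer would instead receive feature maps produced by the CNN stage.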

Keywords: phoneme recognition; convolutional neural network; bidirectional gated recurrent unit

Classification code: TN912.34 [Electronics and Telecommunications: Communication and Information Systems]
