Authors: HE Li-hua (和丽华), JIANG Tao (江涛) [1], PAN Wen-lin (潘文林) [1], YANG Hao-ran (杨皓然) (School of Mathematics and Computer Science, Yunnan Minzu University, Kunming 650500, China)
Affiliation: [1] School of Mathematics and Computer Science, Yunnan Minzu University, Kunming 650500, Yunnan, China
Source: Journal of Yunnan Minzu University: Natural Sciences Edition, 2020, No. 5, pp. 493-500 (8 pages)
Funding: National Natural Science Foundation of China (61363022).
Abstract: The phoneme is the smallest unit of speech in a language system, and phoneme recognition in large-vocabulary speech recognition tasks is not constrained by vocabulary or sentence structure. This paper therefore takes the phoneme as the recognition unit and builds a CNN-BGRU neural network model to classify phoneme spectrograms. First, the short-time Fourier transform is used to generate phoneme spectrograms as the model input. Second, the CNN-BGRU model is built: an improved VGGNet model extracts features from the phoneme spectrograms, and a bidirectional gated recurrent unit (BGRU) then represents their sequence information. Finally, a Softmax classifier classifies the phoneme spectrograms. Experiments on the TIMIT English speech dataset reach a phoneme spectrogram recognition accuracy of 98.6%, outperforming four other models: CNN (VGG16), CNN-RNN, CNN-BRNN, and CNN-BLSTM.
Classification code: TN912.34 [Electronics and Telecommunications: Communication and Information Systems]
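The abstract names two standard building blocks: a short-time Fourier transform that turns a waveform into a spectrogram, and a Softmax layer that turns class scores into probabilities. The sketch below illustrates both in plain Python. It is not the authors' implementation: the function names, the Hann window, and the `frame_len` and `hop` values are assumptions chosen for illustration, and the CNN-BGRU network itself is not reproduced here.

```python
import cmath
import math


def hann_window(n):
    """Return a periodic Hann window of length n (an assumed window choice)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def stft_magnitude(signal, frame_len=64, hop=32):
    """Magnitude spectrogram: one row per frame, frame_len // 2 + 1 bins per row.

    Uses a direct O(N^2) DFT for readability; a real pipeline would use an FFT.
    """
    window = hann_window(frame_len)
    n_bins = frame_len // 2 + 1
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        # Window the current frame to reduce spectral leakage.
        chunk = [signal[start + i] * window[i] for i in range(frame_len)]
        row = []
        for k in range(n_bins):
            acc = sum(
                chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            row.append(abs(acc))
        frames.append(row)
    return frames


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    # Synthetic tone whose frequency falls exactly on DFT bin 8.
    tone = [math.sin(2.0 * math.pi * 8 * n / 64) for n in range(256)]
    spectrogram = stft_magnitude(tone)
    print(len(spectrogram), "frames x", len(spectrogram[0]), "bins")
    print(softmax([1.0, 2.0, 3.0]))
```

In a pipeline like the one the abstract describes, the rows of such a spectrogram would be rendered as an image and passed to the convolutional feature extractor, and the Softmax step would sit at the output of the recurrent layers.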