Authors: SUN Jie (孙杰)[1,2]; Wushour Silamu (吾守尔.斯拉木); Reyiman Tursun (热依曼.吐尔逊) (School of Information Science and Engineering, Xinjiang University, Urumqi 830046, China; Department of Physics, Changji University, Changji 831100, China)
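Affiliations: [1] School of Information Science and Engineering, Xinjiang University, Urumqi, Xinjiang 830046; [2] Department of Physics, Changji University, Changji, Xinjiang 831100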
Source: Modern Electronics Technique (《现代电子技术》), 2018, No. 24, pp. 132-136, 140 (6 pages)
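Funding: National Key Basic Research Program of China ("973" Program, 2014CB340506); National Natural Science Foundation of China (61433012); National Natural Science Foundation of China (61363063); Xinjiang Uygur Autonomous Region Key Laboratory Project (2015KL013)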
Abstract: Speech recognition for minority languages suffers from low recognition rates because training data are sparse. For the recognition of low-resource Kirgiz, this paper proposes a convolutional maxout network (CMN) for building a cross-language acoustic model. The CMN model uses the local sampling and weight sharing of the convolutional neural network (CNN) to reduce the number of network parameters, and replaces the CNN's convolutional kernels with maxout neurons to improve the network's ability to extract abstract features. The cross-language CMN is first pre-trained on Uygur, a language with relatively rich resources, using the dropout regularization training method to prevent over-fitting. A phoneme mapping set based on forced alignment of synonyms is then created according to the similarity of the two languages and used to label the Kirgiz data to be recognized. Finally, the CMN parameters are fine-tuned on the limited target-language corpus. The experimental results show that the word error rate (WER) of the proposed CMN acoustic model is 8.3% lower than that of the baseline CNN acoustic model.
Keywords: speech recognition; low resource; Kirgiz; cross-language acoustic model; CMN; phoneme mapping
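Classification: TN711-34 [Electronics and Telecommunications: Circuits and Systems]; TP391 [Automation and Computer Technology: Computer Application Technology]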
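The abstract names two building blocks, maxout neurons and dropout regularization, without giving the network's details. The following is a minimal NumPy sketch of those two operations only, not the authors' CMN. The function names `maxout` and `inverted_dropout`, the group size `k`, and the example array are assumptions made for illustration.

```python
import numpy as np


def maxout(z, k):
    """Maxout activation: split the last axis into groups of k
    pre-activations and keep the maximum of each group."""
    if z.shape[-1] % k != 0:
        raise ValueError("last dimension must be divisible by k")
    grouped = z.reshape(*z.shape[:-1], z.shape[-1] // k, k)
    return grouped.max(axis=-1)


def inverted_dropout(x, rate, rng):
    """Training-time dropout: zero each element with probability `rate`
    and rescale the survivors so the expected value is unchanged."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    mask = rng.random(x.shape) >= rate
    return x * mask / (1.0 - rate)


# Hypothetical pre-activations: 2 frames, 4 linear outputs each.
z = np.array([[1.0, -2.0, 0.5, 3.0],
              [-1.0, -0.5, 2.0, 2.5]])

h = maxout(z, k=2)            # 2 maxout units per frame
rng = np.random.default_rng(0)
h_train = inverted_dropout(h, rate=0.5, rng=rng)
```

With `k=2`, each maxout unit outputs the larger of two linear pre-activations, so `h` has half as many columns as `z`. At inference time the dropout step is skipped.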