Combined Dimension Reduction Method for Isolated Digital Speech Recognition

Cited by: 9


Authors: SONG Qingsong (宋青松)[1], TIAN Zhengxin (田正鑫)[1], SUN Wenlei (孙文磊)[1], WU Xiaojie (吴小杰)[1], AN Yisheng (安毅生)[1]

Affiliation: [1] School of Information Engineering, Chang'an University, Xi'an 710064, China

Source: Journal of Xi'an Jiaotong University (《西安交通大学学报》), 2016, No. 6, pp. 42-46 (5 pages)

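Funding: National Natural Science Foundation of China (61201406); China Postdoctoral Science Foundation (2013M531998); Fundamental Research Funds for the Central Universities (310824162022; 310824162021)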

Abstract: A combined dimension reduction method is proposed to improve noise robustness in isolated digit speech recognition. The method consists of four functional modules in sequence: Mel-frequency cepstral coefficient (MFCC) feature extraction, linear dimension reduction, a restricted Boltzmann machine (RBM), and a Softmax classifier. Based on the basic principle of principal component analysis (PCA), the MFCC feature vectors are reduced in dimension and brought to a uniform dimensionality. The RBM then learns from the reduced feature vectors, which improves the classification performance of the back-end Softmax classifier. The RBM is pretrained with the contrastive divergence algorithm and fine-tuned with the conjugate gradient algorithm. The method was tested on the TI-46 isolated digit speech corpus and the NOISEX-92 noise database. Experimental results show that it achieves a recognition accuracy of 96.09% and improved noise robustness compared with conventional feedforward neural network recognition methods.
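The four-module chain described in the abstract can be illustrated with a minimal NumPy sketch. Everything below is an assumption made for illustration, not the authors' implementation: the abstract does not say how PCA unifies the dimension, so `unify_by_pca` (a hypothetical helper) flattens the top-k eigenvectors of the frame covariance matrix; the MFCC matrices are random stand-ins rather than TI-46 features; the RBM uses a single CD-1 step per update; and the Softmax layer is trained with plain gradient descent instead of the conjugate gradient fine-tuning mentioned in the abstract.

```python
import numpy as np

def unify_by_pca(mfcc, k=2):
    """Map a (T, D) MFCC matrix with variable T to a fixed vector of size k*D.

    Assumption: the top-k eigenvectors of the frame covariance matrix serve as
    the fixed-size representation; the paper's construction may differ.
    """
    centered = mfcc - mfcc.mean(axis=0)
    cov = centered.T @ centered / max(len(mfcc) - 1, 1)    # (D, D)
    _, eigvecs = np.linalg.eigh(cov)                       # eigenvalues ascending
    top = eigvecs[:, ::-1][:, :k]                          # (D, k), largest first
    # Fix the sign ambiguity of eigenvectors so the output is deterministic.
    signs = np.sign(top[np.abs(top).argmax(axis=0), np.arange(k)])
    signs[signs == 0] = 1.0
    return (top * signs).T.ravel()

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Bernoulli-Bernoulli RBM trained with one-step contrastive divergence."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)   # visible bias
        self.c = np.zeros(n_hidden)    # hidden bias

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.c)

    def cd1_step(self, v0, lr=0.1):
        h0 = self.hidden_probs(v0)
        h_sample = (self.rng.random(h0.shape) < h0).astype(float)
        v1 = sigmoid(h_sample @ self.W.T + self.b)         # reconstruction
        h1 = self.hidden_probs(v1)
        n = len(v0)
        self.W += lr * (v0.T @ h0 - v1.T @ h1) / n
        self.b += lr * (v0 - v1).mean(axis=0)
        self.c += lr * (h0 - h1).mean(axis=0)
        return float(np.mean((v0 - v1) ** 2))              # reconstruction error

def train_softmax(X, y, n_classes, lr=0.5, epochs=200):
    """Softmax regression by batch gradient descent (a simplification)."""
    W = np.zeros((X.shape[1], n_classes))
    b = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]                               # one-hot targets
    for _ in range(epochs):
        logits = X @ W + b
        logits -= logits.max(axis=1, keepdims=True)        # numerical stability
        P = np.exp(logits)
        P /= P.sum(axis=1, keepdims=True)
        grad = (P - Y) / len(X)
        W -= lr * X.T @ grad
        b -= lr * grad.sum(axis=0)
    return W, b

def demo(seed=0):
    rng = np.random.default_rng(seed)
    # Synthetic stand-ins for MFCC matrices: 20 utterances, 13 coefficients each,
    # with a different number of frames per utterance.
    utterances = [rng.standard_normal((rng.integers(30, 80), 13)) for _ in range(20)]
    labels = rng.integers(0, 10, size=20)
    X = np.stack([unify_by_pca(u, k=2) for u in utterances])   # (20, 26)
    X = (X + 1.0) / 2.0        # unit-eigenvector entries lie in [-1, 1]; map to [0, 1]
    rbm = RBM(X.shape[1], 16, rng)
    for _ in range(50):
        rbm.cd1_step(X)
    H = rbm.hidden_probs(X)                                    # learned features
    W, b = train_softmax(H, labels, n_classes=10)
    pred = (H @ W + b).argmax(axis=1)
    return X, H, pred
```

Calling `demo()` returns the unified feature matrix, the RBM hidden activations, and the predicted digit labels. Because the inputs are random, the predictions carry no meaning; the sketch only shows how variable-length utterances end up as fixed-length vectors before reaching the RBM and the classifier.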

Keywords: speech recognition; principal component analysis; restricted Boltzmann machine

Classification code: TP301.6 [Automation and Computer Technology: Computer System Architecture]

 
