Improved MELP Algorithm Based on Reconstructive Deep Neural Network  Cited by: 2

Authors: Zhang Xiongwei (张雄伟)[1], Wu Haijia (吴海佳)[1], Zhang Liangliang (张梁梁)[1], Zou Xia (邹霞)[1]

Affiliation: [1] College of Command Information Systems, PLA University of Science and Technology, Nanjing 210007, China

Source: Journal of Data Acquisition and Processing (《数据采集与处理》), 2015, No. 2, pp. 307-318 (12 pages)

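Funding: National Natural Science Foundation of China (61471394); Natural Science Foundation of Jiangsu Province (BK2012510); National Natural Science Foundation of China, Youth Fund (61402519); Natural Science Foundation of Jiangsu Province, Youth Fund (BK20140071; BK20140074)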

Abstract: To improve the coding and reconstruction performance of deep models, a reconstruction-error constraint based on cross entropy is added to the traditional contrastive divergence (CD) algorithm. The improved algorithm is used to train a reconstructive deep auto-encoder (RDAE), which replaces the vector quantization of the line spectral frequency (LSF) parameters in the mixed excitation linear prediction (MELP) speech coder. Experimental results show that the improved algorithm gains reconstruction performance at the cost of some model likelihood. When the RDAE hidden layer is set to 19 bits, the weighted LSF distance, the reconstructed speech quality, and the spectral distortion are all better than those of 25-bit vector quantization on both the training set and the test set. In other words, the MELP coder improved with the proposed method lowers the coding bit rate from 2.4 kb/s to 2.1 kb/s, a reduction of 12.5%, without degrading speech quality.
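The abstract does not give the update rule, so the following is only a minimal sketch of what "CD with a cross-entropy reconstruction constraint" could look like for a single Bernoulli layer. It is not the authors' formulation. The function names, the weighting factor `lam`, the learning rate, the tied encoder/decoder weights, and the assumption that inputs are scaled to [0, 1] are all hypothetical choices made for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def reconstruction_cross_entropy(v, W, b, c):
    """Mean cross entropy between v and its deterministic reconstruction."""
    h = sigmoid(v @ W + c)
    v_hat = np.clip(sigmoid(h @ W.T + b), 1e-7, 1.0 - 1e-7)
    per_sample = -(v * np.log(v_hat) + (1.0 - v) * np.log(1.0 - v_hat)).sum(axis=1)
    return per_sample.mean()


def cd1_reconstruction_step(v0, W, b, c, rng, lr=0.05, lam=0.5):
    """One parameter update: CD-1 gradient minus lam times the gradient
    of the reconstruction cross entropy (lam is a hypothetical weight)."""
    n = v0.shape[0]

    # Standard CD-1 statistics (one Gibbs step).
    h0 = sigmoid(v0 @ W + c)
    h0_sample = (rng.random(h0.shape) < h0).astype(v0.dtype)
    v1 = sigmoid(h0_sample @ W.T + b)
    h1 = sigmoid(v1 @ W + c)
    cd_W = (v0.T @ h0 - v1.T @ h1) / n
    cd_b = (v0 - v1).mean(axis=0)
    cd_c = (h0 - h1).mean(axis=0)

    # Gradient of the mean cross entropy for the pass v0 -> h0 -> v_hat,
    # with the same W used for encoding and decoding (tied weights).
    v_hat = sigmoid(h0 @ W.T + b)
    d_out = (v_hat - v0) / n                 # w.r.t. decoder pre-activation
    d_hid = (d_out @ W) * h0 * (1.0 - h0)    # back through the encoder
    ce_W = d_out.T @ h0 + v0.T @ d_hid
    ce_b = d_out.sum(axis=0)
    ce_c = d_hid.sum(axis=0)

    # Ascend the CD direction, descend the reconstruction error.
    W = W + lr * (cd_W - lam * ce_W)
    b = b + lr * (cd_b - lam * ce_b)
    c = c + lr * (cd_c - lam * ce_c)
    return W, b, c


# Toy usage: 10-dimensional inputs (stand-ins for scaled LSF vectors)
# and a 19-unit hidden layer, matching the 19-bit code in the abstract.
rng = np.random.default_rng(0)
data = rng.random((64, 10))
W = 0.01 * rng.standard_normal((10, 19))
b = np.zeros(10)
c = np.zeros(19)
for _ in range(50):
    W, b, c = cd1_reconstruction_step(data, W, b, c, rng)
loss = reconstruction_cross_entropy(data, W, b, c)
```

Setting `lam=0` reduces the step to plain CD-1, which makes it easy to compare the two objectives on the same data. A full RDAE would stack several such layers and binarize the 19-unit code layer before transmission; those steps are outside this sketch.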

Keywords: deep learning; deep auto-encoder; reconstruction; low-bit-rate speech coding; mixed excitation linear prediction

Classification code: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]

 
