Authors: HUANG Jin-wei (黄晋维); BAO Chang-chun (鲍长春); ZHOU Jing (周静)
Affiliation: [1] Institute of Speech and Audio Signal Processing, Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
Source: Acta Electronica Sinica (《电子学报》), 2024, No. 8, pp. 2581-2590 (10 pages)
Funding: National Natural Science Foundation of China (No. 61831019).
Abstract: For neural-network-based speech packet loss concealment (PLC), the input features are an important factor that directly affects the final recovery performance. In addition, restoring highly natural speech through PLC remains an open problem. To recover lost speech packets effectively and improve naturalness, this paper proposes a speech PLC method based on a prior Mel-spectrum and a neural vocoder. The method adopts an asymmetric encoder-decoder network structure. At the encoding stage, two independent encoding networks extract latent time-frequency features from the time-domain waveform and the Mel-spectrogram, respectively. At the decoding stage, these latent time-frequency features are fed jointly into a vocoder composed of temporal adaptive de-normalization layers, which restores the lost speech signal and improves its naturalness. Simulation experiments show that the proposed method outperforms two existing PLC algorithms in terms of perceptual evaluation of speech quality and short-time objective intelligibility.
Keywords: packet loss concealment; prior Mel-spectrum; neural vocoder; temporal adaptive de-normalization layer; time-frequency features
Classification code: TN912 [Electronics and Telecommunications: Communication and Information Systems]
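Note: The abstract names temporal adaptive de-normalization layers as the building block of the vocoder but gives no implementation details. The sketch below illustrates only the general idea of such a layer, under stated assumptions: each channel is normalized over time, then re-scaled and shifted frame by frame using values projected from conditioning features. The function names (`project`, `tade_like_layer`), the weight layout, and the toy inputs are hypothetical and are not taken from the paper.

```python
from statistics import fmean, pstdev


def project(weights, cond):
    """Per-frame linear projection (equivalent to a 1x1 convolution).

    weights: [out_channels][cond_channels], cond: [cond_channels][frames].
    Returns [out_channels][frames].
    """
    frames = len(cond[0])
    return [
        [sum(w * cond[k][t] for k, w in enumerate(row)) for t in range(frames)]
        for row in weights
    ]


def tade_like_layer(x, cond, w_gamma, w_beta, eps=1e-5):
    """Hypothetical adaptive de-normalization step, not the paper's network.

    x: [channels][frames] activations.
    cond: [cond_channels][frames] conditioning features (assumed to be
    already aligned with x in time).
    """
    gamma = project(w_gamma, cond)  # time-varying scale
    beta = project(w_beta, cond)    # time-varying shift
    out = []
    for channel, g, b in zip(x, gamma, beta):
        # Normalize the channel over time, then modulate each frame.
        mu, sigma = fmean(channel), pstdev(channel)
        out.append([
            g_t * (v - mu) / (sigma + eps) + b_t
            for v, g_t, b_t in zip(channel, g, b)
        ])
    return out


# Toy input: 2 channels, 4 frames, 1 conditioning channel.
x = [[1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 2.0, 2.0]]
cond = [[1.0, 1.0, 0.0, 0.0]]
w_gamma = [[1.0], [1.0]]
w_beta = [[0.5], [0.5]]
y = tade_like_layer(x, cond, w_gamma, w_beta)
```

In this toy input the conditioning is zero for the last two frames, so both the scale and the shift vanish there and the output is zero. This shows how the conditioning signal, rather than fixed learned constants, controls the output frame by frame.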