Interactive Dual-Branch Monaural Speech Enhancement Model Based on Critical Frequency Band

Cited by: 3

Authors: YE Zhongfu (叶中付) [1,2]; ZHAO Ziwei (赵紫微); YU Runxiang (于润祥)

Affiliations: [1] Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei 230022, China; [2] National Engineering Research Center of Speech and Language Information Processing, Hefei 230022, China

Source: Journal of Data Acquisition and Processing (《数据采集与处理》), 2023, No. 2, pp. 262-273 (12 pages)

Funding: National Natural Science Foundation of China (61671418).

Abstract: Mainstream dual-branch monaural speech enhancement methods attend only to full-band information and ignore subband information. To address this, an interactive dual-branch model based on the critical frequency bands of the human ear is proposed. The complex spectrum branch splits the signal into subbands with a division that mimics the critical bands of the human ear and extracts subband information; the magnitude compensation branch processes the full band of the signal directly and extracts full-band information. The complex spectrum branch produces an initial estimate of the magnitude and phase of the clean speech, while dedicated modules pass the intermediate subband features learned on this branch to the magnitude compensation branch. The output of the magnitude compensation branch then further compensates the magnitude estimated by the complex spectrum branch, so as to recover the clean speech spectrum. Experimental results show that the proposed model outperforms other advanced monaural speech enhancement models in restoring speech quality and intelligibility.
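The critical-band split on the complex spectrum branch can be illustrated with a short sketch. This is not the authors' implementation: the record gives no band edges, sampling rate, or FFT size, so the code below assumes Zwicker's classical Bark band edges, 16 kHz audio, and a 512-point STFT, and the names `critical_band_slices` and `split_into_bands` are hypothetical.

```python
import numpy as np

# Assumption: Zwicker's classical critical-band (Bark) edges in Hz.
# The record does not state which edges the paper uses.
BARK_EDGES_HZ = [0, 100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270,
                 1480, 1720, 2000, 2320, 2700, 3150, 3700, 4400, 5300,
                 6400, 7700, 9500, 12000, 15500]


def critical_band_slices(n_fft=512, sample_rate=16000):
    """Partition the n_fft // 2 + 1 STFT bins into critical-band slices."""
    n_bins = n_fft // 2 + 1
    nyquist = sample_rate / 2
    # Centre frequency of every STFT bin, from 0 Hz up to Nyquist.
    freqs = np.linspace(0.0, nyquist, n_bins)
    # Keep the edges below Nyquist and close the last band at Nyquist.
    edges = [e for e in BARK_EDGES_HZ if e < nyquist] + [nyquist]
    # First bin whose frequency is at or above each edge.
    idx = np.searchsorted(freqs, edges, side="left")
    idx[-1] = n_bins  # the last band also takes the Nyquist bin
    # Drop bands that would contain no bins at this resolution.
    return [slice(int(a), int(b)) for a, b in zip(idx[:-1], idx[1:]) if b > a]


def split_into_bands(spec, slices):
    """Split a (freq_bins, frames) complex spectrogram into subband blocks."""
    return [spec[s, :] for s in slices]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in for a noisy complex spectrogram: 257 bins, 100 frames.
    spec = rng.standard_normal((257, 100)) + 1j * rng.standard_normal((257, 100))
    bands = split_into_bands(spec, critical_band_slices())
    for k, band in enumerate(bands):
        print(f"band {k:2d}: {band.shape[0]:3d} bins")
```

Low bands contain only a few bins and high bands many, which is the property a critical-band division is meant to capture; a full-band branch would instead operate on the whole `spec` array at once.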

Keywords: critical frequency band; interactivity; subband; dual-branch; monaural speech enhancement

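Classification codes: TN912.35 [Electronics and Telecommunications: Communication and Information Systems]; TP183 [Electronics and Telecommunications: Information and Communication Engineering]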

