Authors: YE Zhongfu (叶中付)[1,2]; ZHAO Ziwei (赵紫微); YU Runxiang (于润祥)
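Affiliations: [1] Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei 230022, China; [2] National Engineering Research Center of Speech and Language Information Processing, Hefei 230022, China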
Source: Journal of Data Acquisition and Processing (《数据采集与处理》), 2023, No. 2, pp. 262-273 (12 pages)
Funding: National Natural Science Foundation of China (61671418).
Abstract: Current mainstream dual-branch single-channel speech enhancement methods attend only to full-band information and ignore subband information. To address this, an interactive dual-branch model based on the critical bands of the human ear is proposed. The complex spectrum branch splits the signal into frequency bands using a division that mimics the ear's critical bands and extracts subband information; the amplitude compensation branch processes the full band of the signal directly and extracts full-band information. The complex spectrum branch performs an initial recovery of the amplitude and phase of the clean speech, and dedicated modules pass the subband intermediate features it learns to the amplitude compensation branch. The output of the amplitude compensation branch then further compensates the amplitude output by the complex spectrum branch, so as to recover the clean speech spectrum. Experimental results show that the proposed model outperforms other advanced single-channel speech enhancement models in restoring speech quality and intelligibility.
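The band-splitting step on the complex spectrum branch can be illustrated with a minimal sketch. The abstract does not give the paper's actual band division, so this example assumes the classical Zwicker (Bark) critical-band edges; the names `BARK_EDGES_HZ`, `critical_band_slices`, and `split_into_subbands` are hypothetical and are not taken from the paper.

```python
import numpy as np

# Classical Zwicker critical-band edges in Hz (Bark scale).
# ASSUMPTION: the paper's exact band division is not stated in the abstract.
BARK_EDGES_HZ = [0, 100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270,
                 1480, 1720, 2000, 2320, 2700, 3150, 3700, 4400, 5300,
                 6400, 7700, 9500, 12000, 15500]


def critical_band_slices(n_fft, sample_rate):
    """Return one slice of one-sided FFT bin indices per critical band."""
    n_bins = n_fft // 2 + 1
    # Centre frequency of each one-sided FFT bin, in Hz.
    freqs = np.arange(n_bins) * sample_rate / n_fft
    nyquist = sample_rate / 2
    # Keep only the edges below Nyquist and close the last band at Nyquist.
    edges = [e for e in BARK_EDGES_HZ if e < nyquist] + [nyquist]
    slices = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        start = int(np.searchsorted(freqs, lo, side="left"))
        stop = int(np.searchsorted(freqs, hi, side="left"))
        if stop > start:  # skip bands that contain no bin at this resolution
            slices.append(slice(start, stop))
    # Extend the last band so that the Nyquist bin is not dropped.
    slices[-1] = slice(slices[-1].start, n_bins)
    return slices


def split_into_subbands(spec, n_fft, sample_rate):
    """Split a (..., n_bins) complex spectrogram into per-band arrays."""
    return [spec[..., s] for s in critical_band_slices(n_fft, sample_rate)]


if __name__ == "__main__":
    spec = np.zeros((10, 257), dtype=complex)  # 10 frames, 512-point FFT
    subbands = split_into_subbands(spec, n_fft=512, sample_rate=16000)
    print(len(subbands), [s.shape[-1] for s in subbands])
```

Each returned array would then be processed separately on the subband branch, while the full `spec` goes to the full-band branch; how the two branches exchange features is specific to the paper and is not sketched here.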
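Classification: TN912.35 [Electronics and Telecommunications: Communication and Information Systems]; TP183 [Electronics and Telecommunications: Information and Communication Engineering]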