Authors: ZHANG Tianqi (张天骐) [1]; LUO Qingyu (罗庆予); FANG Rong (方蓉); ZHANG Huizhi (张慧芝) (School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications (CQUPT), Chongqing 400065, China)
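Affiliation: [1] School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China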
Source: Journal of Signal Processing (《信号处理》), 2023, No. 7, pp. 1285-1298 (14 pages)
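Funding: National Natural Science Foundation of China (61671095, 61702065, 61701067, 61771085); Construction Project of the Chongqing Municipal Key Laboratory of Signal and Information Processing (CSTC2009CA2003); Chongqing Natural Science Foundation (cstc2021jcyj-msxmX0836); Scientific Research Project of the Chongqing Municipal Education Commission (KJ1600427, KJ1600429).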
Abstract: Deep neural networks for speech enhancement have difficulty extracting rich global speech-related information and underuse intermediate-level features. To address these problems while keeping the parameter count as small as possible, this paper builds on the attention U-Net and designs a new convolutional encoder-decoder network for speech enhancement based on information refinement and residual feature aggregation. In the encoder and decoder, the paper proposes a two-dimensional Hierarchical Refinement Residual (HRR) module, which significantly reduces the number of trainable parameters, enlarges the receptive field, and extracts multi-scale contextual information at different levels. In the transmission layer, it proposes a lightweight One-Dimensional Channel Dimension Adaptive Attention (1D-CAA) module, which combines a gating mechanism with norm-based normalization to pass features selectively and improve the network's representational capacity. Together with gated residual linear units, this module is used to build a Gating Residual Feature Aggregation (GRFA) network, which strengthens the information flow between layers, makes full use of intermediate-level feature details, and captures more temporally correlated information. In the experiments, the network is trained and tested in 21 noise environments. With 1.23×10⁶ parameters, it achieves better objective and subjective metrics than the compared methods, shows strong enhancement performance and generalization ability, and strikes a good balance between model complexity and accuracy.
Keywords: speech enhancement; multi-scale context; adaptive attention mechanism; residual feature aggregation
Classification No.: TN911.7 [Electronics and Telecommunications: Communication and Information Systems]