Authors: CAI Yudong (蔡昱东); LIU Xue (刘雪); LIAO Xiang (廖祥); ZHOU Yi (周艺)
Affiliations: [1] Guangxi Key Laboratory of Special Biomedicine & Advanced Institute for Brain and Intelligence, School of Medicine, Guangxi University, Nanning 530004, P.R. China; [2] Department of Biomedical Engineering and Imaging Medicine, Army Medical University, Chongqing 400038, P.R. China; [3] Centre for Neurointelligence Research, School of Medicine, Chongqing University, Chongqing 400030, P.R. China; [4] Department of Neurobiology, School of Basic Medicine, Army Medical University, Chongqing 400038, P.R. China
Source: Journal of Biomedical Engineering (《生物医学工程学杂志》), 2025, Issue 1, pp. 82-89 (8 pages)
Funding: National Natural Science Foundation of China (32171001, 32371050); Guangxi Science and Technology Base and Talent Special Project (task no. 桂科AD22035948).
Abstract: The way the human brain processes speech information is an important source of inspiration for speech enhancement research. Attention and lateral inhibition are key mechanisms in auditory information processing that can selectively enhance specific information. Building on this, the study proposes a dual-branch U-Net that combines a lateral inhibition mechanism with a feedback-driven attention mechanism. The noisy speech signal is fed into the first U-Net branch; the time-frequency units with high confidence in its output are selectively fed back, and the resulting activation-layer gradients, combined with the lateral inhibition mechanism, are used to compute an attention map. This map is concatenated into the second U-Net branch to guide the network's attention and thereby selectively enhance the speech signal. Enhancement quality was evaluated with five metrics, including perceptual evaluation of speech quality (PESQ), and the method was compared against five others: Wiener, SEGAN, PHASEN, Demucs, and GRN. Across the noise scenarios tested, the proposed method improved on the baseline network by 18% to 21% across the evaluation metrics, and under low signal-to-noise ratio conditions it significantly outperformed the other methods. Speech enhancement based on lateral inhibition and feedback-driven attention therefore shows considerable potential and may be applicable to clinical practice involving cochlear implants and hearing aids.
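The abstract does not give the formula used to turn activation-layer gradients into an attention map, so the following is only a minimal sketch of the general idea under stated assumptions: the saliency of each time-frequency unit is taken as the rectified product of activation and gradient, and lateral inhibition is modelled as subtracting a fraction of the mean of neighbouring frequency bins. The function names and the `strength` and `radius` parameters are hypothetical and are not taken from the paper.

```python
from typing import List

Matrix = List[List[float]]  # indexed as [frequency_bin][time_frame]


def gradient_weighted_map(activation: Matrix, gradient: Matrix) -> Matrix:
    """Rectified element-wise product of activation and gradient (assumed form)."""
    return [
        [max(a * g, 0.0) for a, g in zip(act_row, grad_row)]
        for act_row, grad_row in zip(activation, gradient)
    ]


def lateral_inhibition(saliency: Matrix, strength: float = 0.5, radius: int = 1) -> Matrix:
    """Suppress each unit by a fraction of the mean of its neighbouring frequency bins.

    Assumed centre-surround form; `strength` and `radius` are illustrative parameters.
    """
    n_freq = len(saliency)
    n_time = len(saliency[0])
    out = [[0.0] * n_time for _ in range(n_freq)]
    for f in range(n_freq):
        # Neighbouring bins within `radius`, excluding the unit itself and clipped at the edges.
        neighbours = [k for k in range(f - radius, f + radius + 1) if k != f and 0 <= k < n_freq]
        for t in range(n_time):
            surround = sum(saliency[k][t] for k in neighbours) / len(neighbours) if neighbours else 0.0
            out[f][t] = max(saliency[f][t] - strength * surround, 0.0)
    return out


def normalise(matrix: Matrix) -> Matrix:
    """Scale the map so that its largest value is 1; an all-zero map is returned unchanged."""
    peak = max(max(row) for row in matrix)
    if peak == 0.0:
        return [row[:] for row in matrix]
    return [[v / peak for v in row] for row in matrix]


def attention_map(activation: Matrix, gradient: Matrix,
                  strength: float = 0.5, radius: int = 1) -> Matrix:
    """Gradient-weighted saliency, sharpened by lateral inhibition, scaled to [0, 1]."""
    saliency = gradient_weighted_map(activation, gradient)
    return normalise(lateral_inhibition(saliency, strength, radius))


if __name__ == "__main__":
    # Three frequency bins, one time frame: a strong centre bin flanked by weak ones.
    activation = [[0.2], [1.0], [0.2]]
    gradient = [[1.0], [1.0], [1.0]]
    print(attention_map(activation, gradient))
```

In this toy input the weak flanking bins are suppressed to zero while the dominant bin is kept, which is the sharpening effect lateral inhibition is meant to provide. In the setup the abstract describes, a map of this kind would be concatenated to the input of the second U-Net branch.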
Keywords: auditory speech; speech enhancement; lateral inhibition; attention; convolutional neural network
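Classification codes: TP183 [Automation and Computer Technology: Control Theory and Control Engineering]; TN912.35 [Automation and Computer Technology: Control Science and Engineering]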