Bispectrum feature speech intelligibility algorithm using Gammatone filter


Authors: CHEN Xiao-mei[1], WANG Xiao-wei, ZHONG Bo[2], YANG Jia-yan, SHANG Ying-ying[3]

Affiliations: [1] Department of Electrical and Electronic Engineering, North China Electric Power University, Beijing 102206, China; [2] Division of Mechanics and Acoustics Metrology, National Institute of Metrology, Beijing 100029, China; [3] Department of Otolaryngology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences, Beijing 100730, China

Source: Computer Engineering and Design, 2023, No. 5, pp. 1288-1296 (9 pages)

Funding: National Key Research and Development Program of China (2020YFC2005200).

Abstract: Existing speech intelligibility evaluation methods do not closely reflect how the human ear perceives speech. To address this, a bispectrum-feature algorithm for predicting speech intelligibility based on human auditory characteristics, the Gammatone-bspectral speech intelligibility metric (GBSIM), is proposed. The algorithm exploits the bispectrum's ability to detect nonlinear phase coupling in speech signals and to suppress Gaussian noise in non-Gaussian signals. A Gammatone filter bank, which can simulate the cochlear implant model, splits the input speech signal into 32 auditory sub-bands. Third-order statistics are then used to estimate the bispectrum of the speech signal in each sub-band, and a single feature value is extracted to compute speech intelligibility. Experimental validation shows that the method is sensitive to changes in signal distortion, that its results correlate highly with subjective evaluation, and that it performs better than traditional speech intelligibility evaluation algorithms.
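The processing chain described in the abstract (Gammatone sub-band decomposition followed by a per-band bispectrum estimate) can be illustrated with a minimal sketch. The bispectrum is the Fourier transform of the third-order cumulant, B(f1, f2) = E[X(f1) X(f2) X*(f1 + f2)]. It vanishes for a zero-mean Gaussian process, which is why it can suppress Gaussian noise. The abstract does not give the filter parameters, the estimator, or the feature, so everything below is an assumption for illustration only: the ERB-spaced centre frequencies, the 4th-order FIR Gammatone approximation, the direct (segment-averaged) estimator, the frequency range, and above all the choice of mean bispectrum magnitude as the "single feature value". It is not the authors' GBSIM implementation, and it stops before any mapping to an intelligibility score.

```python
import cmath
import math


def erb(fc):
    """Equivalent rectangular bandwidth in Hz (Glasberg-Moore formula)."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)


def center_freqs(n_bands, f_lo, f_hi):
    """Centre frequencies equally spaced on the ERB-rate scale (n_bands >= 2)."""
    def to_rate(f):
        return 21.4 * math.log10(4.37 * f / 1000.0 + 1.0)

    def to_hz(r):
        return (10.0 ** (r / 21.4) - 1.0) * 1000.0 / 4.37

    lo, hi = to_rate(f_lo), to_rate(f_hi)
    step = (hi - lo) / (n_bands - 1)
    return [to_hz(lo + i * step) for i in range(n_bands)]


def gammatone_ir(fc, fs, length, order=4):
    """Sampled gammatone impulse response, normalised to unit energy."""
    b = 1.019 * erb(fc)  # bandwidth parameter commonly used with order 4
    h = [(n / fs) ** (order - 1)
         * math.exp(-2.0 * math.pi * b * n / fs)
         * math.cos(2.0 * math.pi * fc * n / fs)
         for n in range(length)]
    norm = math.sqrt(sum(v * v for v in h))
    return [v / norm for v in h]


def fir_filter(x, h):
    """Causal FIR filtering by direct convolution (output as long as x)."""
    return [sum(h[k] * x[n - k] for k in range(min(len(h), n + 1)))
            for n in range(len(x))]


def dft(frame):
    """Plain O(n^2) discrete Fourier transform of one frame."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n))
            for k in range(n)]


def bispectrum_feature(x, nfft=32):
    """Direct bispectrum estimate averaged over non-overlapping frames.

    Returns the mean magnitude of B(k1, k2) over 0 <= k1, k2 < nfft/2.
    This feature is a hypothetical stand-in, not the paper's feature.
    """
    frames = [x[i:i + nfft] for i in range(0, len(x) - nfft + 1, nfft)]
    if not frames:
        raise ValueError("signal is shorter than one frame")
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * t / (nfft - 1))
              for t in range(nfft)]  # Hann window
    half = nfft // 2
    acc = [[0j] * half for _ in range(half)]
    for frame in frames:
        mean = sum(frame) / nfft
        spec = dft([(v - mean) * w for v, w in zip(frame, window)])
        for k1 in range(half):
            for k2 in range(half):
                # Triple product X(k1) X(k2) X*(k1 + k2); k1 + k2 < nfft here.
                acc[k1][k2] += spec[k1] * spec[k2] * spec[k1 + k2].conjugate()
    total = sum(abs(v) for row in acc for v in row)
    return total / (len(frames) * half * half)


def band_features(x, fs, n_bands=32, f_lo=100.0, f_hi=None, ir_len=256):
    """One bispectrum feature per gammatone sub-band (32 bands by default)."""
    if f_hi is None:
        f_hi = 0.45 * fs  # assumed upper edge, kept below the Nyquist frequency
    return [bispectrum_feature(fir_filter(x, gammatone_ir(fc, fs, ir_len)))
            for fc in center_freqs(n_bands, f_lo, f_hi)]
```

Calling `band_features(signal, fs)` on a list of samples returns 32 non-negative values, one per sub-band. The sketch uses only the standard library so that it runs as written. A practical version would more likely use NumPy/SciPy FFT and filtering routines, and would compare the features of a clean reference with those of the degraded signal.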

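Keywords: speech intelligibility; objective evaluation algorithm; nonlinear distortion; auditory characteristics; Gammatone filter bank; higher-order statistics; bispectrum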

Classification number: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]

 
