Speech enhancement based on multitaper spectrum and psychoacoustical weighting rule  Cited by: 1


Authors: WU Hongwei, WU Zhenyang, ZHAO Li

Affiliations: [1] College of Information Science and Engineering, Southeast University, Nanjing 210096; [2] School of Electronics and Information, Suzhou University, Suzhou 215021

Source: Chinese Journal of Acoustics (声学学报, English edition), 2007, No. 3, pp. 278-288 (11 pages)

Funding: This work was supported by the 973 Project of China (No. 2002CB312102), by the National Natural Science Foundation of China (No. 60272044), and by the Youth Research Fund of Suzhou University (No. Q3119610).

Abstract: The multitaper spectrum has lower variance than the traditional periodogram. The noise spectrum and the noise-to-noisy-signal spectrum ratio (NNSR) are estimated from the multitaper spectrum of the noisy signal. A pre-enhanced speech signal, used to calculate the noise masking threshold, is obtained by spectral amplitude subtraction with a gain that is a function of the NNSR. The final enhanced speech is obtained by suppressing the Fourier spectrum of the noisy speech with a psychoacoustical weighting rule that incorporates the noise masking threshold. Because the multitaper spectrum has low variance, a modified offset formula is proposed for calculating the noise masking threshold; speech reconstructed with this modification shows an improvement in MBSD (Modified Bark Spectral Distortion). When an upper limit of less than one is additionally imposed on the psychoacoustical weighting rule, the improvement in segmental SNR and overall SNR grows with the input SNR (for input SNR > 0 dB). Informal listening tests show that speech enhanced by the proposed method has little distortion, and that the background noise is greatly reduced and free of musical noise.
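The abstract's opening claim, that a multitaper spectrum has lower variance than a periodogram, can be illustrated with a small simulation. The sketch below is not the authors' implementation: it assumes sine tapers (the abstract does not say which taper family the paper uses), white Gaussian noise as the test signal, and hypothetical function names and parameter values chosen only for illustration.

```python
import cmath
import math
import random


def sine_tapers(n, k):
    """Return k orthonormal sine tapers of length n (an assumed taper family)."""
    scale = math.sqrt(2.0 / (n + 1))
    return [[scale * math.sin(math.pi * (j + 1) * (i + 1) / (n + 1))
             for i in range(n)]
            for j in range(k)]


def power_spectrum(frame, taper):
    """Squared DFT magnitude of the tapered frame for bins 0..n/2."""
    n = len(frame)
    x = [s * w for s, w in zip(frame, taper)]
    spectrum = []
    for b in range(n // 2 + 1):
        acc = sum(x[i] * cmath.exp(-2j * math.pi * b * i / n) for i in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def periodogram(frame):
    """Single-window estimate using a unit-energy rectangular window."""
    n = len(frame)
    return power_spectrum(frame, [1.0 / math.sqrt(n)] * n)


def multitaper(frame, tapers):
    """Average the spectra obtained with each taper."""
    spectra = [power_spectrum(frame, t) for t in tapers]
    return [sum(column) / len(tapers) for column in zip(*spectra)]


def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def compare_variance(n=32, k=5, trials=300, bin_index=8, seed=0):
    """Estimate the variance of both estimators at one frequency bin.

    Each trial draws a fresh frame of white Gaussian noise, so the true
    spectrum is flat and any spread comes from the estimator itself.
    """
    rng = random.Random(seed)
    tapers = sine_tapers(n, k)
    per_values, mt_values = [], []
    for _ in range(trials):
        frame = [rng.gauss(0.0, 1.0) for _ in range(n)]
        per_values.append(periodogram(frame)[bin_index])
        mt_values.append(multitaper(frame, tapers)[bin_index])
    return variance(per_values), variance(mt_values)
```

Calling `compare_variance()` returns the pair (periodogram variance, multitaper variance). For white noise and a bin away from 0 and the Nyquist frequency, averaging `k` nearly uncorrelated tapered spectra should shrink the variance by roughly a factor of `k`, which is the property the abstract relies on when estimating the noise spectrum and the NNSR.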

Classification number: O42 [Science: Acoustics]

 
