Monaural voiced speech segregation based on elaborate harmonic grouping strategies

Authors: LIU WenJu, ZHANG XueLiang, JIANG Wei, LI Peng, XU Bo

Affiliations: [1] National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; [2] Digital Media Content Technology Research Center, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China

Source: Science China (Information Sciences) [中国科学（信息科学）（英文版）, English edition], 2011, No. 12, pp. 2471-2480 (10 pages)

Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 60675026, 90820303, 90820011)

Abstract: In this paper, an enhanced algorithm for monaural voiced speech segregation is proposed, based on several elaborate harmonic grouping strategies. Its main contributions lie in three aspects. First, the algorithm classifies the time-frequency (T-F) units into resolved and unresolved ones by the carrier-to-envelope energy ratio, which leads to more accurate classification than cross-channel correlation. Second, resolved T-F units are grouped according to the minimum amplitude principle, which has been verified to exist in human perception, as well as the harmonic principle. Finally, an "enhanced" envelope autocorrelation function is employed to detect amplitude modulation rates, which helps reduce half-frequency errors in the grouping of unresolved units. Systematic evaluation and comparison show that separation performance is greatly improved by the proposed algorithm. Specifically, the signal-to-noise ratio (SNR) is improved by 0.96 dB compared with that of the previous method. The algorithm is also effective in improving the PESQ score and the subjective perception score.
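
Illustrative sketch (not from the paper): the abstract's third point, detecting amplitude modulation rates from an envelope autocorrelation function, can be illustrated with a plain, unenhanced autocorrelation. The code below is a minimal sketch under stated assumptions. The function names `analytic_envelope` and `estimate_am_rate`, the 80-400 Hz search range, and the synthetic test signal are all hypothetical choices made for illustration; the paper's "enhanced" function is not reproduced here.

```python
import numpy as np


def analytic_envelope(x):
    """Envelope of a real signal via the FFT-based analytic signal (Hilbert transform)."""
    n = len(x)
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0                      # keep the DC bin
    if n % 2 == 0:
        weights[n // 2] = 1.0             # keep the Nyquist bin
        weights[1:n // 2] = 2.0           # double positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spectrum * weights))


def estimate_am_rate(x, fs, fmin=80.0, fmax=400.0):
    """Estimate the amplitude modulation rate (Hz) from the envelope autocorrelation.

    fmin and fmax are assumed bounds on plausible modulation rates (illustrative only).
    """
    env = analytic_envelope(x)
    env = env - env.mean()                # remove DC so the peak reflects modulation
    acf = np.correlate(env, env, mode="full")[len(env) - 1:]  # lags 0..n-1
    lag_min = int(fs / fmax)
    lag_max = int(fs / fmin)
    best_lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return fs / best_lag


# Synthetic example: a 2000 Hz carrier amplitude-modulated at 100 Hz.
fs = 16000
t = np.arange(3200) / fs
signal = (1.0 + 0.8 * np.cos(2 * np.pi * 100 * t)) * np.cos(2 * np.pi * 2000 * t)
print(estimate_am_rate(signal, fs))       # expected to be close to 100 Hz
```

A plain autocorrelation also peaks at integer multiples of the modulation period, which is one way an estimate at half the true rate can arise. The sketch sidesteps this only by bounding the lag search range.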

Keywords: computational auditory scene analysis; voiced speech separation; harmonistic principle; minimum amplitude principle; elaborate harmonic grouping strategies

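Classification: TN722.75 [Electronics and Telecommunications: Circuits and Systems]; TM864 [Electrical Engineering: High Voltage and Insulation Technology]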
