Super Resolution Pitch Detection Based on Band-Partitioning Spectral Entropy and Signal Decomposition in DCT Domain  Cited by: 5

Authors: Luo Yafei (罗亚飞) [1], Bao Changchun (鲍长春) [1]

Affiliation: [1] School of Electronic Information and Control Engineering, Beijing University of Technology, Beijing 100022, China

Source: Acta Electronica Sinica (《电子学报》), 2007, No. 1, pp. 13-22 (10 pages)

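Funding: National Natural Science Foundation of China (No. 60372063); Beijing Natural Science Foundation (No. 4042009); Science and Technology Development Program of the Beijing Municipal Education Commission (No. KM200710005001)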

Abstract: This paper studies pitch detection for low-rate waveform interpolation (WI) speech coding. Pitch detectors are prone to voiced/unvoiced misclassification under varying noise types and signal-to-noise ratios (SNRs), so a voice activity detection (VAD) algorithm based on DCT band-partitioning spectral entropy is introduced at the front end to separate speech from non-speech segments. To supply the pitch detector with input speech that more accurately reflects the actual variation of the pitch period, an improved DCT-domain speech decomposition algorithm based on the harmonic-noise model is proposed. Then, exploiting the fact that the transformed MCAMDF (Modified Circular Average Magnitude Difference Function) and the NCCF (Normalized Cross-Correlation Function) share common peaks, and building on the two front-end techniques above, a combined pitch detection algorithm, MCAMDF-NCCF, is proposed. To meet the WI coder's requirement for high pitch accuracy in different environments and to recover the phase track more accurately at the synthesis end, a high-precision algorithm, MCAMDF-NCCF-FRAC, is further derived from MCAMDF-NCCF to compute fractional pitch. The algorithms were applied to a 2 kb/s WI coder. Subjective A/B listening tests indicate that, at low SNR, the proposed algorithms markedly suppress pitch doubling, pitch halving, and voiced/unvoiced errors and yield excellent pitch detection results, and that the synthesized speech quality fully meets the requirements a low-rate WI coder places on pitch detection.

Keywords: speech coding; pitch detection; voice activity detection; signal decomposition; waveform interpolation

Classification: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]
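
Illustrative sketch: the abstract describes combining a circular AMDF with the NCCF through their shared peaks and then refining the result to a fractional pitch. It does not give the formulas, so the code below is only a minimal sketch under stated assumptions, not the authors' MCAMDF-NCCF or MCAMDF-NCCF-FRAC algorithm. It uses the plain circular AMDF rather than the paper's modified form. The valley-to-peak flip, the product score, the parabolic refinement, and all function names are assumptions made here. The VAD and signal decomposition front ends are omitted.

```python
import math


def nccf(frame, lag):
    """Normalized cross-correlation of the frame with itself shifted by `lag`."""
    n = len(frame) - lag
    head, tail = frame[:n], frame[lag:lag + n]
    num = sum(a * b for a, b in zip(head, tail))
    den = math.sqrt(sum(a * a for a in head) * sum(b * b for b in tail))
    return num / den if den > 0.0 else 0.0


def camdf(frame, lag):
    """Plain circular AMDF: the shifted index wraps modulo the frame length."""
    n = len(frame)
    return sum(abs(frame[(i + lag) % n] - frame[i]) for i in range(n)) / n


def estimate_pitch_lag(frame, min_lag, max_lag):
    """Return (integer_lag, fractional_lag) in samples for one voiced frame."""
    lags = list(range(min_lag, max_lag + 1))
    d = [camdf(frame, k) for k in lags]
    d_max = max(d)
    # Assumption: flip CAMDF valleys into peaks, then multiply by the NCCF so
    # that only lags supported by both functions score highly.
    score = [
        nccf(frame, k) * ((1.0 - dk / d_max) if d_max > 0.0 else 0.0)
        for k, dk in zip(lags, d)
    ]
    best = max(range(len(lags)), key=score.__getitem__)
    if best == 0 or best == len(lags) - 1:
        return lags[best], float(lags[best])  # no neighbours to interpolate
    # Assumption: parabolic interpolation through the peak and its neighbours
    # stands in for the paper's fractional-pitch step.
    left, mid, right = score[best - 1], score[best], score[best + 1]
    curvature = left - 2.0 * mid + right
    offset = 0.5 * (left - right) / curvature if curvature < 0.0 else 0.0
    return lags[best], lags[best] + offset


def synth_frame(period, length):
    """Hypothetical test signal: a fundamental plus its second harmonic."""
    return [
        math.sin(2.0 * math.pi * i / period)
        + 0.5 * math.sin(4.0 * math.pi * i / period)
        for i in range(length)
    ]


frame = synth_frame(50.4, 400)
print(estimate_pitch_lag(frame, 20, 90))
```

The search range here (20 to 90 samples) is chosen so that only one multiple of the synthetic period falls inside it. A real detector would also need logic for pitch doubling and halving, which is the problem the paper targets.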

 
