High-Performance Algorithm for Speech Quality Evaluation Based on Psychoacoustics Model  (Cited by: 1)


Authors: Zhang Jun[1], Zhang Deyun[1], Gao Lei[1], Zhao Dongping[1]

Affiliation: [1] School of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China

Source: Journal of Xi'an Jiaotong University, 2006, No. 4, pp. 437-440 (4 pages)

Funding: National High Technology Research and Development Program of China (2003AA148010)

Abstract: An efficient psychoacoustic-model speech quality evaluation algorithm (EPM-SQE) is proposed. The algorithm uses 12th-order Mel-frequency cepstral coefficients (MFCC) as the feature vector of the speech signal; their space complexity is lower than that of the Bark spectrum. Relative spectral (RASTA) filtering is applied to the MFCC to emphasize the influence of rapidly changing signal components on auditory perception. The filtered parameters are then mapped to loudness to simulate human perception. The perceptual disturbance, defined as the absolute difference between the loudness of the original speech and that of the degraded speech, is aggregated first over frequency and then over time to yield a single disturbance value, from which a cognitive model computes the final objective score. Experiments show that, compared with the Perceptual Evaluation of Speech Quality (PESQ) algorithm proposed by ITU-T, EPM-SQE reduces average running time by 41% and memory usage by 51%, while its correlation with subjective evaluation falls by only 6.8%.
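The abstract names the processing steps but gives no formulas, so the following is only a minimal sketch of two of them under stated assumptions, not the authors' implementation. `rasta_filter` uses the commonly cited RASTA band-pass form in causal shape with a pole of 0.98; the paper's actual coefficients are not given here. `aggregate_disturbance` assumes length-normalised Lp norms for the frequency and time aggregation, and the exponents `p_freq` and `p_time` are hypothetical. MFCC extraction, the loudness mapping, and the cognitive model are omitted because the abstract does not describe them; the inputs are simply treated as per-frame loudness-like vectors.

```python
from typing import List, Sequence

Frames = List[List[float]]  # frames[t][k]: coefficient k in frame t


def rasta_filter(frames: Frames, pole: float = 0.98) -> Frames:
    """Band-pass each coefficient trajectory along the time axis.

    Assumed causal form (not taken from the paper):
        y[t] = 0.1 * (2x[t] + x[t-1] - x[t-3] - 2x[t-4]) + pole * y[t-1]
    Samples before the first frame are treated as zero.
    """
    if not frames:
        return []
    n_coef = len(frames[0])

    def x(t: int, k: int) -> float:
        return frames[t][k] if t >= 0 else 0.0

    out: Frames = []
    for t in range(len(frames)):
        row = []
        for k in range(n_coef):
            fir = 0.1 * (2 * x(t, k) + x(t - 1, k) - x(t - 3, k) - 2 * x(t - 4, k))
            prev = out[t - 1][k] if t > 0 else 0.0
            row.append(fir + pole * prev)
        out.append(row)
    return out


def lp_norm(values: Sequence[float], p: float) -> float:
    """Length-normalised Lp norm: (mean(|v|^p)) ** (1/p)."""
    return (sum(abs(v) ** p for v in values) / len(values)) ** (1.0 / p)


def aggregate_disturbance(ref: Frames, deg: Frames,
                          p_freq: float = 2.0, p_time: float = 2.0) -> float:
    """Collapse per-frame absolute differences into one disturbance value.

    Aggregates over frequency (within each frame) first, then over time.
    The exponents are hypothetical defaults, not values from the paper.
    """
    if len(ref) != len(deg):
        raise ValueError("reference and degraded signals need equal frame counts")
    per_frame = [
        lp_norm([abs(r - d) for r, d in zip(f_ref, f_deg)], p_freq)
        for f_ref, f_deg in zip(ref, deg)
    ]
    return lp_norm(per_frame, p_time)
```

Because the numerator taps sum to zero, a constant trajectory is suppressed after the first few frames, which is how this kind of filter emphasizes changing components. Identical reference and degraded inputs give a disturbance of zero.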

Keywords: psychoacoustic model; Mel cepstrum; perceptual disturbance; quality evaluation

Classification: TN912 [Electronics and Telecommunications: Communication and Information Systems]

 
