Mapping methods for output-based objective speech quality assessment using data mining (cited by: 2)


Affiliation: [1] School of Information and Technology, Beijing Institute of Technology

Source: Journal of Central South University (English edition), 2014, No. 5, pp. 1919-1926 (8 pages)

Funding: Projects (61001188, 1161140319) supported by the National Natural Science Foundation of China; Project (2012ZX03001034) supported by the National Science and Technology Major Project; Project (YETP1202) supported by the Beijing Higher Education Young Elite Teacher Project, China

Abstract: Objective speech quality is difficult to measure without the input reference speech. Mapping methods using data mining are investigated and designed to improve the output-based speech quality assessment algorithm. The degraded speech is first separated into three classes (unvoiced, voiced and silence); the consistency between the degraded speech signal and the pre-trained reference model for each class is then calculated and mapped to an objective speech quality score using data mining. A fuzzy Gaussian mixture model (GMM) is used to generate the artificial reference model, trained on perceptual linear predictive (PLP) features. Mean opinion score (MOS) mapping methods, including multivariate non-linear regression (MNLR), fuzzy neural network (FNN) and support vector regression (SVR), are designed and compared with the standard ITU-T P.563 method. Experimental results show that the assessment methods with data mining perform better than ITU-T P.563. Moreover, FNN and SVR are more efficient than MNLR, and FNN performs best, with a 14.50% increase in the correlation coefficient and a 32.76% decrease in the root-mean-square MOS error.
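The two figures of merit quoted in the abstract, the correlation coefficient and the root-mean-square MOS error, can be illustrated with a minimal sketch. It assumes the Pearson correlation between predicted and subjective MOS, which the abstract does not state explicitly. The score lists and the function names `pearson`, `rmse` and `relative_change` are hypothetical and chosen only for illustration; they are not data or code from the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def rmse(predicted, target):
    """Root-mean-square error between predicted and target scores."""
    squared = sum((p - t) ** 2 for p, t in zip(predicted, target))
    return math.sqrt(squared / len(predicted))


def relative_change(new, baseline):
    """Percentage change of `new` relative to `baseline` (positive = increase)."""
    return 100.0 * (new - baseline) / baseline


# Hypothetical scores for illustration only; not taken from the paper.
subjective_mos = [1.5, 2.0, 2.8, 3.4, 4.1, 4.5]   # listener ratings
baseline_pred = [2.2, 1.9, 3.5, 2.9, 3.6, 4.0]    # stand-in for a baseline method
candidate_pred = [1.7, 2.1, 3.0, 3.3, 3.9, 4.4]   # stand-in for a mapped score

r_base = pearson(baseline_pred, subjective_mos)
r_cand = pearson(candidate_pred, subjective_mos)
e_base = rmse(baseline_pred, subjective_mos)
e_cand = rmse(candidate_pred, subjective_mos)

print(f"correlation change: {relative_change(r_cand, r_base):+.2f}%")
print(f"RMSE change:        {relative_change(e_cand, e_base):+.2f}%")
```

A positive correlation change together with a negative RMSE change means the candidate tracks the subjective scores more closely than the baseline, which is the sense in which the abstract reports an increase in one quantity and a decrease in the other.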

Keywords: objective speech quality; data mining; multivariate non-linear regression; fuzzy neural network; support vector regression

Classification code: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]
