Research on a Speaker Recognition Algorithm Based on Adaptive GMM Order and Hybrid Features

Authors: FAN Tao; ZHAN Xu (School of Automation and Information Engineering, Sichuan University of Science & Engineering, Yibin 644000, China)

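Affiliation: [1] School of Automation and Information Engineering, Sichuan University of Science & Engineering, Yibin 644000, Sichuan, China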

Source: Journal of Sichuan University of Science & Engineering (Natural Science Edition), 2024, No. 4, pp. 75-83 (9 pages)

Funding: Key Research and Development Project of the Sichuan Provincial Department of Science and Technology (2022YFS0554).

Abstract: To address the shortcomings of Gaussian mixture model (GMM) order selection and the lack of sufficient speaker feature information, a speaker recognition algorithm based on an adaptive GMM order and the fusion of multiple speech features is proposed. First, Mel-frequency cepstral coefficients (MFCC) and linear prediction Mel-frequency cepstral coefficients (LPMFCC) are extracted, and a 17-dimensional hybrid feature vector combining MFCC and LPMFCC parameters is selected according to the Fisher criterion to enrich the speaker's feature information. Then, following an adaptive approach, the within-cluster sum of squared errors (SSE) is computed in the K-means clustering algorithm. Finally, the value of K is adjusted adaptively using the elbow method to obtain an optimal GMM order, so that the system achieves the best recognition performance with the available voiceprint features. The results show that the algorithm both enriches the speaker's feature information and overcomes the shortcomings of GMM order selection. By fusing the two feature algorithms LPCC and MFCC, the resulting hybrid feature LPMFCC+MFCC raises the recognition rate by 26.34% and 12.34% compared with LPCC and MFCC, respectively.
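The abstract names three computational steps: ranking feature dimensions by the Fisher criterion, computing the K-means within-cluster SSE for candidate values of K, and picking K at the elbow of the SSE curve. The following is a minimal sketch of those steps in plain Python, not the authors' implementation. The function names (`fisher_ratio`, `kmeans_sse`, `elbow_k`) are hypothetical, the Fisher ratio is taken here as between-class scatter over within-class scatter per dimension, and the elbow is located with a largest-second-difference heuristic; the abstract does not state which elbow rule the paper uses, so that rule is an assumption.

```python
import random


def fisher_ratio(features, labels):
    """Per-dimension Fisher ratio: between-class scatter / within-class scatter."""
    dims = len(features[0])
    overall = [sum(row[d] for row in features) / len(features) for d in range(dims)]
    between = [0.0] * dims
    within = [0.0] * dims
    for c in set(labels):
        rows = [row for row, lab in zip(features, labels) if lab == c]
        for d in range(dims):
            mean_c = sum(r[d] for r in rows) / len(rows)
            between[d] += (mean_c - overall[d]) ** 2
            within[d] += sum((r[d] - mean_c) ** 2 for r in rows) / len(rows)
    # The small constant avoids division by zero for dimensions with no within-class spread.
    return [b / (w + 1e-12) for b, w in zip(between, within)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_sse(points, k, n_iter=50, seed=0):
    """Run plain K-means (Lloyd's algorithm) and return the within-cluster SSE."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centers[j]))
            clusters[nearest].append(p)
        for j, members in enumerate(clusters):
            if members:  # keep the previous center if a cluster becomes empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def elbow_k(k_values, sse_values):
    """Assumed elbow rule: the K with the largest second difference of the SSE curve."""
    best_k, best_bend = None, float("-inf")
    for i in range(1, len(k_values) - 1):
        bend = sse_values[i - 1] - 2 * sse_values[i] + sse_values[i + 1]
        if bend > best_bend:
            best_k, best_bend = k_values[i], bend
    return best_k
```

In a pipeline of the kind the abstract describes, the highest-ranked dimensions from `fisher_ratio` would form the hybrid feature vector, `kmeans_sse` would be evaluated over a range of K on one speaker's frames, and the K returned by `elbow_k` would be used as the number of GMM components (for example, the `n_components` argument of scikit-learn's `GaussianMixture`).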

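Keywords: speaker recognition; Gaussian mixture model; Mel-frequency cepstral coefficients; linear prediction Mel coefficients; Fisher criterion; adaptive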

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]