Text-independent speaker recognition based on LPBMFCC

Cited by: 1

Authors: Mao Wenqing; Guan Yepeng [1]

Affiliation: [1] College of Communication and Information Engineering, Shanghai University, Shanghai 200444, China

Source: Electronic Measurement Technology, 2020, No. 19, pp. 169-176 (8 pages)

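Funding: Supported by the National Natural Science Foundation of China (11176016, 60872117) and the Specialized Research Fund for the Doctoral Program of Higher Education (20123108110014).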

Abstract: To address the problem that a speaker's high-frequency information cannot be fully extracted, a new method for extracting vocal tract features is proposed for text-independent speaker recognition. First, a set of linear-prediction-based mel-frequency cepstral coefficients (LPBMFCC) is proposed to eliminate the high-frequency harmonics that interfere with the auditory ability to distinguish two different pure tones, yielding discriminative vocal tract features. In addition, multi-scale wavelet analysis is used to extract time-frequency features of the sound-source speech signal as a complement to LPBMFCC. To study the discriminative power of LPBMFCC and other features in speaker recognition, a distance-based discriminability comparison scheme is proposed that visually represents the dispersion of different acoustic features. The method is evaluated on the NIST 2008 database with a Gaussian mixture model (GMM)-based speaker recognition system. Experimental results show that the proposed LPBMFCC features are strongly discriminative, with recognition rates 5% to 10% higher than those of some state-of-the-art methods. Adding the time-frequency features as a complement to LPBMFCC raises the recognition rate by a further 1% to 4% compared with LPBMFCC alone. The proposed method therefore achieves better performance.
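The abstract names a GMM-based speaker recognition system but gives no implementation details. As a rough illustration only, the sketch below shows the standard closed-set decision rule such systems commonly use: score the test frames against each enrolled speaker's GMM and pick the speaker with the highest average log-likelihood. The function names, the diagonal-covariance assumption, the toy 2-dimensional "features", and the model parameters are all hypothetical and are not taken from the paper; in practice the frames would be feature vectors such as LPBMFCC and the models would be trained, for example with EM.

```python
import math
import random


def gmm_avg_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.

    frames:    list of feature vectors (each a list of floats)
    weights:   mixture weights, one per component (should sum to 1)
    means:     per-component mean vectors
    variances: per-component diagonal variances
    """
    total = 0.0
    for x in frames:
        comp_logs = []
        for w, mu, var in zip(weights, means, variances):
            # log(weight) + log of a diagonal Gaussian density
            log_p = math.log(w)
            for xi, mi, vi in zip(x, mu, var):
                log_p += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
            comp_logs.append(log_p)
        # log-sum-exp over components for numerical stability
        m = max(comp_logs)
        total += m + math.log(sum(math.exp(c - m) for c in comp_logs))
    return total / len(frames)


def identify_speaker(frames, models):
    """Return (best_speaker, scores) using the maximum average log-likelihood."""
    scores = {name: gmm_avg_loglik(frames, *params) for name, params in models.items()}
    return max(scores, key=scores.get), scores


# Hypothetical two-speaker setup with hand-picked 2-component models.
models = {
    "speaker_a": ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]),
    "speaker_b": ([0.5, 0.5], [[5.0, 5.0], [6.0, 6.0]], [[1.0, 1.0], [1.0, 1.0]]),
}

# Synthetic test frames drawn near speaker_b's components.
random.seed(0)
test_frames = [[random.gauss(5.5, 1.0), random.gauss(5.5, 1.0)] for _ in range(200)]

best, scores = identify_speaker(test_frames, models)
print(best, scores)
```

Averaging the log-likelihood over frames keeps scores comparable across utterances of different lengths, which matters for text-independent recognition where utterance length varies.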

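Keywords: speaker recognition; vocal tract features; sound source; feature fusion; frequency cepstral coefficients; time-frequency features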

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]
