Authors: Mao Wenqing; Guan Yepeng [1]
Affiliation: [1] College of Communication and Information Engineering, Shanghai University, Shanghai 200444, China
Source: Electronic Measurement Technology (《电子测量技术》), 2020, No. 19, pp. 169-176 (8 pages)
Funding: Supported by the National Natural Science Foundation of China (11176016, 60872117) and the Specialized Research Fund for the Doctoral Program of Higher Education (20123108110014).
Abstract: To address the problem that the high-frequency information of a specific speaker cannot be fully extracted, this paper proposes a new method for extracting vocal tract features for text-independent speaker recognition. First, a set of linear-prediction-based mel-frequency cepstral coefficients (LPBMFCC) is proposed to eliminate the high-frequency harmonics that interfere with auditory perception, so that two different pure tones can be distinguished and discriminative vocal tract features can be derived. In addition, multi-scale wavelet analysis is used to extract time-frequency features of the voice-source speech signal as a complement to LPBMFCC. To study the discriminative power of LPBMFCC and other features in speaker recognition, a comparison scheme based on distance measurement is proposed, which visually represents the dispersion of different acoustic features. The method is evaluated on the NIST 2008 database with a Gaussian mixture model (GMM)-based speaker recognition system. The experimental results show that the proposed LPBMFCC features have strong discriminative power, with a recognition rate 5% to 10% higher than that of some advanced methods. Adding the time-frequency features as a complement raises the recognition rate by a further 1% to 4% compared with LPBMFCC alone. The proposed method therefore outperforms the methods it is compared with.
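The abstract does not state the formula behind the distance-based comparison scheme, so the following is only a rough sketch of one common way to quantify how well a feature set separates speakers: the ratio of the mean distance between speaker centroids to the mean distance of each vector from its own speaker's centroid. The function names, the Euclidean metric, and the toy two-dimensional vectors are all assumptions for illustration and are not taken from the paper.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def discriminability(features_by_speaker):
    """Hypothetical separation score: between-speaker / within-speaker distance.

    features_by_speaker maps a speaker label to a list of feature vectors.
    A larger score means the speakers' feature clouds are further apart
    relative to their own spread. This is an illustrative measure, not the
    scheme defined in the paper.
    """
    if len(features_by_speaker) < 2:
        raise ValueError("need at least two speakers to compare")

    centroids = {s: centroid(v) for s, v in features_by_speaker.items()}

    # Spread of each speaker's vectors around that speaker's own centroid.
    within = [
        euclidean(vec, centroids[s])
        for s, vectors in features_by_speaker.items()
        for vec in vectors
    ]

    # Distance between every unordered pair of speaker centroids.
    speakers = list(centroids)
    between = [
        euclidean(centroids[a], centroids[b])
        for i, a in enumerate(speakers)
        for b in speakers[i + 1:]
    ]

    mean_within = sum(within) / len(within)
    mean_between = sum(between) / len(between)
    if mean_within == 0:
        return math.inf  # every vector sits exactly on its centroid
    return mean_between / mean_within


# Toy data: the same two speakers described by two made-up feature sets.
well_separated = {"spk_a": [[0.0, 0.0], [0.0, 2.0]], "spk_b": [[4.0, 0.0], [4.0, 2.0]]}
overlapping = {"spk_a": [[0.0, 0.0], [0.0, 2.0]], "spk_b": [[1.0, 0.0], [1.0, 2.0]]}

score_separated = discriminability(well_separated)
score_overlapping = discriminability(overlapping)
```

With a score of this kind, competing feature sets extracted from the same recordings can be ranked before any classifier is trained; in the toy data the first feature set scores higher because its speaker centroids are four times further apart while the per-speaker spread is unchanged.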
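Keywords: speaker recognition; vocal tract features; voice source features; fused frequency cepstral coefficients; time-frequency features
Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]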