Sequential Speaker Clustering Based on Vector Quantization

Cited by: 5

Source: Science Technology and Engineering (《科学技术与工程》), 2014, No. 2, pp. 41-44 (4 pages)

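Funding: National Natural Science Foundation of China (61101160); Guangzhou Pearl River Science and Technology New Star Program (2013J2200070); Fundamental Research Funds for the Central Universities, Key Project (2013zz0053); National Undergraduate Innovation Training Program (201210561046); Guangdong Provincial Undergraduate Innovation Training Program (1056112028)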

Abstract: To address the slow speed of the traditional hierarchical clustering method, a sequential speaker clustering method based on vector quantization is proposed. First, the features extracted from each speech segment are vector-quantized to obtain a codebook for that segment. Then the distance between each pair of codebooks is computed according to the Bayesian information criterion (BIC). Finally, speaker clustering is carried out in chronological order. Tests on meeting and news speech data show that the proposed method obtains a speaker clustering F-score of 73.47% for meeting speech and 80.00% for news speech. In terms of processing speed, the proposed method achieves a speed-up of 3.16 times over sequential clustering without vector quantization and 53.31 times over the traditional hierarchical clustering method.
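The three steps named in the abstract (per-segment codebooks, BIC distance between codebooks, clustering in time order) can be illustrated with a minimal sketch. The abstract does not give the implementation details, so everything below is an assumption for illustration only: k-means is used as the vector quantizer, the BIC distance models each codebook with a single diagonal-covariance Gaussian, and a segment joins the closest existing cluster when the BIC distance is below a threshold of 0. The function names `train_codebook`, `delta_bic`, and `sequential_cluster` are hypothetical, and synthetic vectors stand in for Mel-frequency cepstral coefficients.

```python
import numpy as np


def train_codebook(features, size=16, iters=10, seed=0):
    """Vector-quantize one segment (frames x dims) into `size` codewords.

    Assumption: plain k-means; the paper's quantizer may differ.
    """
    rng = np.random.default_rng(seed)
    size = min(size, len(features))
    # Initialise codewords from randomly chosen frames.
    codebook = features[rng.choice(len(features), size, replace=False)].copy()
    for _ in range(iters):
        # Squared Euclidean distance from every frame to every codeword.
        dists = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        nearest = dists.argmin(axis=1)
        for k in range(size):
            members = features[nearest == k]
            if len(members):  # leave a codeword unchanged if no frame maps to it
                codebook[k] = members.mean(axis=0)
    return codebook


def _log_det_diag(x):
    # Log-determinant of a diagonal covariance, with a small variance floor.
    return np.log(x.var(axis=0) + 1e-6).sum()


def delta_bic(a, b, penalty_weight=1.0):
    """BIC distance between two codebooks; positive suggests different speakers.

    Assumption: one diagonal-covariance Gaussian per codebook.
    """
    n_a, n_b = len(a), len(b)
    n = n_a + n_b
    merged = np.vstack([a, b])
    gain = 0.5 * (n * _log_det_diag(merged)
                  - n_a * _log_det_diag(a)
                  - n_b * _log_det_diag(b))
    n_params = 2 * a.shape[1]  # mean and variance per dimension
    return gain - 0.5 * penalty_weight * n_params * np.log(n)


def sequential_cluster(codebooks, threshold=0.0):
    """Assign segments to speakers in time order (hypothetical merge rule)."""
    clusters, labels = [], []
    for cb in codebooks:
        scores = [delta_bic(c, cb) for c in clusters]
        if scores and min(scores) < threshold:
            best = int(np.argmin(scores))
            # Pool the new codewords into the matching cluster.
            clusters[best] = np.vstack([clusters[best], cb])
        else:
            clusters.append(cb)
            best = len(clusters) - 1
        labels.append(best)
    return labels


# Usage with synthetic "features": two well-separated speakers, alternating.
rng = np.random.default_rng(1)
segments = [rng.normal(mean, 1.0, size=(200, 4)) for mean in (0.0, 10.0, 0.0, 10.0)]
labels = sequential_cluster([train_codebook(s) for s in segments])
print(labels)
```

Because each comparison uses only a small codebook rather than all frames of a segment, the distance computation touches far fewer vectors, which is the intuition behind the speed-up reported in the abstract. The sketch does not attempt to reproduce the reported figures.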

Keywords: sequential speaker clustering; vector quantization; Bayesian information criterion; Mel-frequency cepstral coefficients

Classification: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]
