The research of speaker diarization based on BIC and G_PLDA  Cited by: 7

Authors: Li Rui (李锐)[1], Zhuo Zhu (卓著)[1], Li Hui (李辉)[1]

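Affiliation: [1] Department of Electronic Science and Technology, University of Science and Technology of China, Hefei, Anhui 230027, China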

Source: Journal of University of Science and Technology of China (JUSTC), 2015, No. 4, pp. 286-293 (8 pages)

Abstract: Traditional speaker diarization (SD), which uses the Bayesian information criterion (BIC) as the similarity metric, performs well on short dialogues. As the dialogue grows longer, however, the single-Gaussian model underlying BIC becomes insufficient to describe the data distributions of different speakers, and during hierarchical agglomerative clustering (HAC) it is difficult to set the threshold that separates same-speaker from different-speaker pairs. To address this problem, a fusion method combining short-term BIC with long-term G_PLDA is proposed, making full use of the reliability of BIC in short-term clustering and the strong discriminative power of G_PLDA on long utterances. Experiments on the NIST 08 Summed test set (NIST: the U.S. National Institute of Standards and Technology) show that the method reduces the diarization error rate (DER) from 2.34% for the BIC baseline system to 1.54%, a relative improvement of 34.2%.
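To illustrate the BIC similarity metric with a single Gaussian per segment that the abstract refers to, here is a minimal sketch of the commonly used ΔBIC test between two segments. It is an illustration under stated assumptions and not the authors' implementation. The function names (`covariance`, `log_det`, `delta_bic`, `sample`), the `penalty_weight` parameter, and the synthetic 4-dimensional "features" are all hypothetical, and the abstract does not specify the exact formulation or features the paper uses.

```python
import math
import random

def covariance(frames):
    """Maximum-likelihood full covariance of a list of d-dimensional frames."""
    n, d = len(frames), len(frames[0])
    mu = [sum(f[k] for f in frames) / n for k in range(d)]
    return [[sum((f[i] - mu[i]) * (f[j] - mu[j]) for f in frames) / n
             for j in range(d)] for i in range(d)]

def log_det(cov):
    """Log-determinant of a symmetric positive-definite matrix via Cholesky."""
    d = len(cov)
    low = [[0.0] * d for _ in range(d)]
    total = 0.0
    for i in range(d):
        for j in range(i + 1):
            s = cov[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(s)  # ValueError if not positive definite
                total += 2.0 * math.log(low[i][j])
            else:
                low[i][j] = s / low[j][j]
    return total

def delta_bic(seg_a, seg_b, penalty_weight=1.0):
    """Delta-BIC of modelling two segments with two Gaussians versus one.

    Positive values favour 'different speakers', negative values favour
    'same speaker'. penalty_weight is an assumed tuning knob.
    """
    n_a, n_b = len(seg_a), len(seg_b)
    n, d = n_a + n_b, len(seg_a[0])
    gain = 0.5 * (n * log_det(covariance(seg_a + seg_b))
                  - n_a * log_det(covariance(seg_a))
                  - n_b * log_det(covariance(seg_b)))
    n_params = d + d * (d + 1) / 2  # mean plus full covariance
    return gain - penalty_weight * 0.5 * n_params * math.log(n)

def sample(n, d, mean, rng):
    """Synthetic stand-in for acoustic features (assumption, not real data)."""
    return [[rng.gauss(mean, 1.0) for _ in range(d)] for _ in range(n)]

rng = random.Random(0)
same_a, same_b = sample(200, 4, 0.0, rng), sample(200, 4, 0.0, rng)
other = sample(200, 4, 5.0, rng)
print("same source:     ", delta_bic(same_a, same_b))
print("different source:", delta_bic(same_a, other))
```

With synthetic data like this, the first score is expected to fall below zero and the second well above it. On real long recordings the single-Gaussian assumption is exactly what the abstract identifies as the weak point, which is the motivation for fusing BIC with G_PLDA on longer segments.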

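Keywords: speaker diarization; Bayesian information criterion; Gaussian probabilistic linear discriminant analysis; diarization error rate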

Classification code: TN912.34 [Electronics and Telecommunications: Communication and Information Systems]

 
