Voice activity detection based on few-shot learning


Authors: SHAN Meng; MIJIT Ablimit[1]; ASKAR Hamdulla[1]

Affiliation: [1] College of Information Science and Engineering, Xinjiang University, Urumqi 830046, Xinjiang, China

Source: Modern Electronics Technique (《现代电子技术》), 2022, No. 24, pp. 145-150 (6 pages)

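Funding: National Defense Science and Technology Foundation Strengthening Program (2021-JCJQ-JJ-0059); National Natural Science Foundation of China (U2003207).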

Abstract: Voice activity detection (VAD) is an important stage of front-end speech signal processing and the basis of many speech tasks. VAD based on deep neural networks requires a large amount of frame-level annotation of the speech data. To address this problem, the paper proposes a VAD algorithm based on few-shot learning with a prototype network (ProtoNet), which further reduces the tedious work caused by frame-level data annotation. The algorithm uses the given labels to compute a classification center, then assigns each unlabeled query point to a classification center according to its distance from that center, which yields a prototype center. On the test set, the distance between each test query point and the prototype center is computed and used for testing. The experimental corpus is based on the MUSAN speech corpus, and noise is added using the noise set included in MUSAN. The experimental results show that, under various environmental noises, the VAD algorithm based on few-shot learning outperforms the VAD algorithm based on deep neural networks, and that it significantly reduces the data preparation workload and the system data volume of the VAD algorithm.
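The distance-to-center classification described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `compute_prototypes` and `classify_queries`, the two-dimensional toy features, the label convention (0 = non-speech, 1 = speech), and the use of squared Euclidean distance are all assumptions made for illustration, and the embedding network that a prototype network would normally train is omitted, so the frame features are used directly.

```python
import numpy as np


def compute_prototypes(support_features, support_labels, num_classes=2):
    """Return one prototype per class: the mean of that class's support features.

    Assumed label convention (not from the paper): 0 = non-speech, 1 = speech.
    """
    return np.stack([
        support_features[support_labels == c].mean(axis=0)
        for c in range(num_classes)
    ])


def classify_queries(query_features, prototypes):
    """Assign each query frame to the class of its nearest prototype."""
    # Broadcast to shape (num_queries, num_classes, feature_dim).
    diffs = query_features[:, None, :] - prototypes[None, :, :]
    # Squared Euclidean distance (an assumed choice of metric).
    distances = (diffs ** 2).sum(axis=-1)
    return distances.argmin(axis=1), distances


# Toy, hand-made frame features; real input would be learned embeddings.
support_features = np.array([[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [1.0, 0.8]])
support_labels = np.array([0, 0, 1, 1])
query_features = np.array([[0.1, 0.1], [0.9, 1.0]])

prototypes = compute_prototypes(support_features, support_labels)
predictions, distances = classify_queries(query_features, prototypes)
print(predictions)
```

Only the few labeled support frames are needed to form the class centers, which is the sense in which such a scheme could reduce frame-level annotation effort.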

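Keywords: voice activity detection; prototype network; few-shot learning; data annotation; speech signal processing; deep learning; result analysis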

Classification number: TN911.23-34 [Electronics and Telecommunications: Communication and Information Systems]

 
