Voice Activity Detection Based on Deep Neural Network and Multi-feature Fusion

Cited by: 3

Authors: Chen Aihua [1]; Zhang Shiqing [1] (School of Electronics and Information Engineering, Taizhou University, Taizhou 318000, China)

Affiliation: [1] School of Electronics and Information Engineering, Taizhou University, Taizhou 318000, Zhejiang, China

Source: Journal of Taizhou University, 2021, No. 3, pp. 1-6, 33 (7 pages)

Funding: National Natural Science Foundation of China (61976149); Natural Science Foundation of Zhejiang Province (LZ20F020002).

Abstract: To address the low detection rate and poor robustness of voice activity detection algorithms that rely on a single technique, this paper proposes a voice activity detection algorithm based on a deep neural network and multi-feature fusion. First, Gammatone filtering, Gabor filtering, and LTSV filtering are used to extract cochlear (auditory) features, short-term features, and long-term signal variability features, respectively, from the audio file. Next, the three features are normalized, fused, and fed to the deep neural network as its input. A pre-trained deep neural network model then computes the probability that each audio segment is speech or non-speech, which determines the segment's label. Finally, median filtering removes falsely detected points, completing voice activity detection. To verify the algorithm's effectiveness, speech signals collected in multiple environments were used in simulation experiments. The results show that the algorithm can perform voice activity detection in environments with high noise intensity, with good accuracy and robustness.
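The normalization, fusion, and median-filter steps named in the abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the feature extractors and the trained network are omitted, and the function names, the per-dimension z-score normalization, the 0.5 decision threshold, and the window width are all assumptions made for the example.

```python
from statistics import mean, median, pstdev


def zscore(column):
    """Scale one feature dimension to zero mean and unit variance (assumed normalization)."""
    mu, sigma = mean(column), pstdev(column)
    if sigma == 0:
        return [0.0 for _ in column]  # constant dimension carries no information
    return [(x - mu) / sigma for x in column]


def normalize_and_fuse(*feature_sets):
    """Normalize each feature set per dimension, then concatenate them frame by frame.

    Each feature set is a list of frames; each frame is a list of floats.
    """
    n_frames = len(feature_sets[0])
    fused = [[] for _ in range(n_frames)]
    for frames in feature_sets:
        if len(frames) != n_frames:
            raise ValueError("all feature sets must cover the same frames")
        columns = [zscore(list(col)) for col in zip(*frames)]
        for t, row in enumerate(zip(*columns)):
            fused[t].extend(row)
    return fused


def median_smooth(labels, width=5):
    """Sliding median over binary frame labels; width must be odd, edges are repeated."""
    half = width // 2
    padded = [labels[0]] * half + list(labels) + [labels[-1]] * half
    return [int(median(padded[i:i + width])) for i in range(len(labels))]


def endpoints(labels):
    """Return (start, end) frame index pairs, end exclusive, for each run of speech."""
    segments, start = [], None
    for i, lab in enumerate(labels):
        if lab and start is None:
            start = i
        elif not lab and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(labels)))
    return segments


def detect(speech_probs, threshold=0.5, width=5):
    """Threshold per-frame speech probabilities, smooth them, and extract endpoints."""
    raw = [1 if p >= threshold else 0 for p in speech_probs]
    return endpoints(median_smooth(raw, width))


# Hypothetical per-frame speech probabilities standing in for the network output.
probs = [0.1, 0.2, 0.9, 0.1, 0.3, 0.8, 0.9, 0.7, 0.4, 0.6, 0.9, 0.2, 0.1]
print(detect(probs, threshold=0.5, width=3))
```

In this toy input, the isolated high-probability frame at index 2 and the single dip at index 8 are both smoothed away, which is the role the abstract assigns to the median filter: removing falsely detected points before the endpoints are read off.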

Keywords: deep neural network; Gammatone filtering; Gabor filtering; LTSV filtering; voice activity detection

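Classification: TN912.34 [Electronics and Telecommunications: Communication and Information Systems]; TP183 [Electronics and Telecommunications: Information and Communication Engineering]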
