Research on Voiceprint Recognition Based on FBank Features and an Improved CNN  (Cited by: 1)


Authors: WANG Mao; HE Yong [1] (College of Computer Science and Technology, Guizhou University, Guiyang 550025, China)

Affiliation: [1] College of Computer Science and Technology, Guizhou University, Guiyang 550025

Source: Intelligent Computer and Applications, 2024, No. 8, pp. 40-47 (8 pages)

Funding: Guizhou Provincial Science and Technology Program Project (Qiankehe Support [2020] No. 2Y007).

Abstract: In voiceprint recognition research, voiceprint signal features often have insufficient representational power and models achieve low recognition accuracy. To address these problems, a voiceprint recognition method based on a convolutional neural network (CNN) is proposed; it takes FBank Mel-spectrogram features, which capture more of the essential character of the sound, as the model input. In addition, most studies rely on deeply stacked network structures to improve the recognition rate, which makes the number of network parameters and the amount of computation (floating-point operations, FLOPs) large and the models difficult to deploy on edge intelligent devices with scarce computing and storage resources. Therefore, grouped convolution is used to optimize the standard CNN backbone, reducing the network's parameter count and computation. At the same time, to preserve the recognition accuracy of the model, the CBAM attention mechanism is used to further optimize the network so that it focuses on the more informative locations along the channel and spatial dimensions. Experiments show that the proposed method achieves high voiceprint recognition accuracy and that, compared with the standard CNN, the optimized model has substantially fewer parameters and requires substantially less computation.
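
The parameter and computation savings that the abstract attributes to grouped convolution can be illustrated with a small counting sketch. The layer sizes below (64 input channels, 128 output channels, 3x3 kernel, 4 groups, 32x32 output map) are assumptions chosen only for illustration and are not taken from the paper; the helper names `conv2d_params` and `conv2d_macs` are likewise hypothetical.

```python
def conv2d_params(c_in, c_out, kernel, groups=1, bias=False):
    """Count learnable parameters of a 2D convolution with a square kernel.

    With grouped convolution, each output channel only sees c_in / groups
    input channels, so the weight count shrinks by a factor of `groups`.
    """
    if c_in % groups != 0 or c_out % groups != 0:
        raise ValueError("c_in and c_out must both be divisible by groups")
    weights = (c_in // groups) * c_out * kernel * kernel
    return weights + (c_out if bias else 0)


def conv2d_macs(c_in, c_out, kernel, out_h, out_w, groups=1):
    """Count multiply-accumulate operations for one forward pass (no bias)."""
    return conv2d_params(c_in, c_out, kernel, groups) * out_h * out_w


# Illustrative layer (assumed sizes, not the paper's architecture).
standard_params = conv2d_params(64, 128, 3)            # standard convolution
grouped_params = conv2d_params(64, 128, 3, groups=4)   # grouped convolution

standard_macs = conv2d_macs(64, 128, 3, 32, 32)
grouped_macs = conv2d_macs(64, 128, 3, 32, 32, groups=4)

print(standard_params, grouped_params)  # 73728 18432
print(standard_macs // grouped_macs)    # 4
```

For a fixed layer shape, both the weight count and the multiply-accumulate count fall by the number of groups. Because grouping also restricts which input channels each output channel sees, some accuracy loss is possible, which is consistent with the abstract's motivation for adding an attention mechanism afterward.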

Keywords: voiceprint recognition; convolutional neural network; FBank features; grouped convolution; CBAM attention mechanism

Classification number: TP391.4 [Automation and Computer Technology: Computer Application Technology]

 
