Authors: WANG Mao, HE Yong [1]
Affiliation: [1] College of Computer Science and Technology, Guizhou University, Guiyang 550025, China
Source: Intelligent Computer and Applications (《智能计算机与应用》), 2024, No. 8, pp. 40-47 (8 pages)
Funding: Guizhou Provincial Science and Technology Plan Project (Qiankehe Support [2020] No. 2Y007).
Abstract: In voiceprint recognition research, voiceprint signal features often have insufficient representational power and models have low recognition accuracy. To address this, a voiceprint recognition method based on a convolutional neural network (CNN) is proposed, using FBank Mel-spectrogram features, which capture more of the essential characteristics of the voice, as the model input. In addition, most studies rely on deeply stacked network structures to improve the recognition rate, which makes the parameter count and computational cost (FLOPs, floating-point operations) large and the models hard to deploy on edge intelligent devices with scarce computing and storage resources. Grouped convolution is therefore used to optimize the standard CNN backbone, reducing both the parameter count and the computational cost. To preserve recognition accuracy, the CBAM attention mechanism is applied to further optimize the network so that it focuses on the more informative parts of the channel and spatial dimensions. Experiments show that the proposed method achieves high voiceprint recognition accuracy, and that, compared with the standard CNN, the optimized model's parameter count and computational cost are both substantially reduced.
Keywords: voiceprint recognition; convolutional neural network; FBank features; grouped convolution; CBAM attention mechanism
Classification: TP391.4 [Automation and Computer Technology: Computer Application Technology]
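
The parameter saving that the abstract attributes to grouped convolution can be illustrated with a small calculation. The following is a minimal sketch, not the authors' network: the helper name `conv2d_params` and the layer sizes (64 input channels, 128 output channels, 3x3 kernel, 4 groups) are assumptions chosen only for illustration. It relies on the standard fact that a 2D convolution with `groups = g` connects each output channel to only `in_channels / g` input channels.

```python
def conv2d_params(in_channels, out_channels, kernel_size, groups=1, bias=True):
    """Count the learnable parameters of a 2D convolution layer.

    Hypothetical helper for illustration; layer sizes below are assumed,
    not taken from the paper.
    """
    if in_channels % groups or out_channels % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each output channel sees only in_channels // groups input channels.
    weights = (in_channels // groups) * out_channels * kernel_size * kernel_size
    return weights + (out_channels if bias else 0)


standard = conv2d_params(64, 128, 3, groups=1)
grouped = conv2d_params(64, 128, 3, groups=4)
print(standard, grouped)  # 73856 18560
```

With these assumed sizes, the weight count drops by a factor equal to the number of groups, while the bias term is unchanged.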