Research on Convolutional Neural Network Structures in Sound Source Localization (声源定位问题中卷积神经网络结构研究)


Authors: 韦永刚 (Wei Yonggang), 肖瑶 (Xiao Yao)[1]

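Affiliation: [1] School of Information and Communication Engineering, Beijing Information Science and Technology University, Beijing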

Source: Artificial Intelligence and Robotics Research (《人工智能与机器人研究》), 2024, No. 4, pp. 961-969 (9 pages)

Abstract: Sound source localization is an important research objective in acoustic signal processing. Traditional methods are susceptible to noise and reverberation. Motivated by the success of deep learning in many fields, this paper explores the use of deep learning for sound source localization. It analyzes the localization performance of convolutional neural network (CNN) structures operating on microphone signals and, through simulation experiments, investigates how the number of convolutional layers and the number of convolutional kernels affect localization performance under identical room and source conditions. The experiments show that, with the generalized cross-correlation with phase transform (GCC-PHAT) features of the sound signals as the CNN input, and under typical room conditions with a signal-to-noise ratio of 10 dB to 40 dB and reverberation times of 200 ms to 600 ms, this approach achieves the highest localization accuracy among the methods compared. Furthermore, a network with 6 convolutional layers and 4 convolutional kernels in the first layer achieves a good balance between localization accuracy and computational efficiency.

Keywords: deep learning; convolutional neural network; sound source localization; generalized cross-correlation features

Classification: TN9 [Electronics and Telecommunications: Information and Communication Engineering]
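
The abstract names phase-transform-weighted generalized cross-correlation (GCC-PHAT) as the CNN input feature. As a rough illustration of that feature only, the sketch below computes GCC-PHAT for one microphone pair with NumPy, using the standard definition: the inverse FFT of the cross-spectrum normalized by its magnitude. The function name `gcc_phat`, its parameters, and the small regularization constant are assumptions made for this example; the paper's actual frame length, microphone geometry, and preprocessing are not given in this record.

```python
import numpy as np


def gcc_phat(x1, x2, fs, max_tau=None):
    """Illustrative GCC-PHAT for one microphone pair (not the paper's code).

    Returns (delay_seconds, cc), where cc holds the correlation values for
    lags -max_shift..+max_shift. A positive delay means x1 lags behind x2.
    """
    # Zero-pad so the circular correlation behaves like a linear one.
    n = len(x1) + len(x2)
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)

    # Cross-spectrum with PHAT weighting: keep the phase, drop the magnitude.
    cross = X1 * np.conj(X2)
    cross = cross / (np.abs(cross) + 1e-12)  # 1e-12 is an assumed guard value

    cc = np.fft.irfft(cross, n=n)

    # Limit the search range to physically plausible lags if requested.
    max_shift = n // 2
    if max_tau is not None:
        max_shift = min(int(fs * max_tau), max_shift)

    # Reorder so that index 0 corresponds to lag -max_shift.
    cc = np.concatenate((cc[n - max_shift:], cc[:max_shift + 1]))
    lags = np.arange(-max_shift, max_shift + 1)

    delay = lags[np.argmax(cc)] / fs
    return delay, cc
```

In a setup like the one described, the `cc` vectors of all microphone pairs would typically be stacked into a 2-D array and passed to the CNN; that stacking step is an assumption here, not a detail confirmed by the abstract.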

 
