Microphone Array-Based Sound Source Localization Using Convolutional Residual Network  Cited by: 1


Authors: Ziyi Wang, Xiaoyan Zhao, Hongjun Rong, Ying Tong, Jingang Shi

Affiliations: [1] School of Information and Communication Engineering, Nanjing Institute of Technology, Nanjing, 211167, China [2] University of Oulu, Oulu, 90014, FI, Finland

Source: Journal of New Media, 2022, Issue 3, pp. 145-153 (9 pages)

Funding: Supported by the Natural Science Research Project of Higher Education Institutions in Jiangsu Province under Grant No. 21KJB510018 and the National Natural Science Foundation of China (NSFC) under Grant No. 62001215.

Abstract: Microphone array-based sound source localization (SSL) is widely used in applications such as video conferencing, robotic hearing, speech enhancement, and speech recognition. Traditional SSL methods cannot achieve satisfactory performance in adverse noisy and reverberant environments. To improve localization performance, this paper proposes a novel SSL algorithm using a convolutional residual network (CRN). Spatial features, including the time differences of arrival (TDOAs) between microphone pairs and the steered response power-phase transform (SRP-PHAT) spatial spectrum, are extracted in each Gammatone sub-band. The spatial features of the different sub-bands within a frame are combined into a feature matrix that serves as the input of the CRN. The proposed algorithm employs the CRN to fuse the spatial features. Because the CRN adds a residual structure to the convolutional network, it reduces the difficulty of training and accelerates the convergence of the model. A CRN model is learned from training data covering various reverberation and noise environments to establish the mapping between the input features and the sound azimuth. Simulation results show that, compared with methods using a traditional deep neural network, the proposed algorithm achieves better localization performance in the SSL task and generalizes better to untrained noise and reverberation conditions.
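
As an illustration of the TDOA feature mentioned in the abstract, the following is a minimal sketch of a delay estimate between one microphone pair using the generalized cross-correlation with phase transform (GCC-PHAT). It is not the authors' implementation: the function name `gcc_phat_tdoa`, its parameters, and the full-band processing are assumptions made for this example, whereas the paper extracts its features per Gammatone sub-band.

```python
import numpy as np

def gcc_phat_tdoa(x1, x2, fs, max_tau=None):
    """Estimate the delay of x2 relative to x1, in seconds, with GCC-PHAT.

    Hypothetical helper for illustration only; works on the full band,
    not on Gammatone sub-bands as described in the abstract.
    """
    n = len(x1) + len(x2)               # zero-pad so the correlation is linear
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)
    cross = X2 * np.conj(X1)            # cross-power spectrum
    cross /= np.abs(cross) + 1e-12      # PHAT weighting: keep only the phase
    cc = np.fft.irfft(cross, n=n)       # generalized cross-correlation

    max_shift = n // 2
    if max_tau is not None:
        # Restrict the search to physically possible lags (at least 1 sample).
        max_shift = max(1, min(int(fs * max_tau), max_shift))

    # Reorder so that lags run from -max_shift to +max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(np.abs(cc))) - max_shift
    return lag / fs

# Usage: a noise burst and a copy delayed by 5 samples at 16 kHz.
rng = np.random.default_rng(0)
fs = 16000
delay = 5
sig = rng.standard_normal(1024)
delayed = np.concatenate((np.zeros(delay), sig[:-delay]))
tdoa = gcc_phat_tdoa(sig, delayed, fs)
```

A positive result means the second signal lags the first. In a setting like the one the abstract describes, one such estimate per microphone pair and per sub-band could form part of the feature matrix, but the exact layout of that matrix is not given in this record.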

Keywords: convolutional residual network; microphone array; spatial features; sound source localization

Classification: TP3 [Automation and Computer Technology: Computer Science and Technology]

 
