Audio-Visual Underdetermined Blind Source Separation Algorithm Based on Gaussian Potential Function  (Cited by: 1)

Authors: ZHANG Ye, CAO Kang, WU Kangrui, YU Tenglong, ZHOU Nanrun

Affiliations: [1] Department of Electronic Information Engineering, Nanchang University; [2] National Engineering Laboratory for Disaster Backup and Recovery, Beijing University of Posts and Telecommunications

Source: China Communications (English edition), 2014, No. 6, pp. 71-80 (10 pages)

Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 61162014, 61210306074); the Natural Science Foundation of Jiangxi Province of China (Grant No. 20122BAB201025); and the Foundation for Young Scientists of Jiangxi Province (Jinggang Star) (Grant No. 20122BCB23002).

Abstract: Most existing algorithms for the underdetermined blind source separation (UBSS) problem are two-stage algorithms, i.e., mixing-parameter estimation followed by source estimation. In the mixing-parameter estimation stage, previously proposed traditional clustering algorithms are sensitive to the initialization of the mixing parameters. To reduce this sensitivity, we propose a new algorithm for the UBSS problem with anechoic speech mixtures that uses visual information to initialize the mixing parameters, i.e., the interaural time difference (ITD) and the interaural level difference (ILD). In our algorithm, the video signals are used to estimate the distances between the microphones and the sources, from which estimates of the ITD and ILD are obtained. Under the sparsity assumption in the time-frequency domain, the Gaussian potential function algorithm then estimates the mixing parameters, using these ITDs and ILDs as initializations. Finally, time-frequency masking is used to recover the sources by evaluating the various ITDs and ILDs. Experimental results demonstrate the competitive performance of the proposed algorithm compared with the baseline algorithms.
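
As a rough illustration of the initialization step described in the abstract, the Python sketch below converts source-to-microphone distances (such as might be estimated from video) into initial ITD and ILD values for one source. The function name, the 343 m/s speed of sound, and the 1/d free-field attenuation model are assumptions made for this example; the abstract does not give the paper's exact formulas.

```python
import math

# Assumed speed of sound in air (m/s); not a value taken from the paper.
SPEED_OF_SOUND = 343.0


def itd_ild_from_distances(d1, d2, c=SPEED_OF_SOUND):
    """Hypothetical helper: initial (ITD in seconds, ILD in dB) for one source.

    d1, d2: distances in metres from the source to microphones 1 and 2.
    Assumes an anechoic free field in which amplitude decays as 1/d.
    """
    if d1 <= 0 or d2 <= 0:
        raise ValueError("distances must be positive")
    # Extra travel time to microphone 2 relative to microphone 1.
    itd = (d2 - d1) / c
    # Level at microphone 2 relative to microphone 1 under 1/d decay.
    ild_db = 20.0 * math.log10(d1 / d2)
    return itd, ild_db


if __name__ == "__main__":
    # A source 1.0 m from microphone 1 and 2.0 m from microphone 2.
    itd, ild_db = itd_ild_from_distances(1.0, 2.0)
    print(f"ITD = {itd * 1e3:.3f} ms, ILD = {ild_db:.2f} dB")
```

Under these assumptions, a source farther from microphone 2 yields a positive ITD and a negative ILD. Values of this kind would only seed the mixing-parameter estimation; the refinement by the Gaussian potential function and the subsequent time-frequency masking are not shown here.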

Keywords: underdetermined blind source separation; interaural time difference; interaural level difference; visual information; Gaussian potential function

Classification: TN911.7 [Electronics and Telecommunications: Communication and Information Systems]

 
