Single-channel audio separation based on a temporal convolutional generative adversarial network

Speech music separation method based on joint training and timing convolution to generate confrontation network

Authors: YU Wen-hu; QUAN Hai-yan [1] (Faculty of Information Engineering & Automation, Kunming University of Science and Technology, Kunming 650500, Yunnan, China)

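Affiliation: [1] Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, Yunnan, China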

Source: Journal of Yunnan University (Natural Sciences Edition), 2023, No. 1, pp. 48-56 (9 pages)

Funding: National Natural Science Foundation of China (61861023).

Abstract: Speech and music often appear mixed together in audio signals, so many applications need to separate the two effectively. Conventional separation methods generally operate on frequency-domain signals, and reconstructing a signal from the frequency domain requires phase information, which introduces deviations in the restored speech. To address the poor separation performance on time-domain single-channel audio signals, this paper proposes introducing joint training and temporal convolution into a generative adversarial network. First, the time-domain speech is preprocessed. Then, the preprocessed data are fed into the generator of the temporal convolutional generative adversarial network for separation. Finally, the separated interference speech and the clean interference speech are sent to the discriminator of the generative adversarial network, and the discrimination results are fed back to the generator. The algorithm is evaluated on the MIR-1K and data_thchs30 datasets. The results show that the proposed single-channel separation model improves the PESQ and STOI scores by 0.31 and 0.07 on average, respectively, which demonstrates that the proposed algorithm effectively improves the separation of speech and music in audio signals.
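The abstract names temporal convolution as the generator's building block but gives no layer details. The sketch below is a generic illustration of what a temporal convolution over a time-domain waveform typically looks like, not the authors' implementation. It assumes a causal, dilated 1-D convolution, which is common in temporal convolutional networks but is not confirmed by the abstract. The names `dilated_causal_conv1d` and `receptive_field` are hypothetical helpers introduced only for this example.

```python
def dilated_causal_conv1d(signal, kernel, dilation=1):
    """Causal dilated 1-D convolution on a list of samples.

    Computes y[t] = sum_k kernel[k] * signal[t - k * dilation],
    treating samples before the start of the signal as zero.
    Assumption: causal and dilated form, not taken from the paper.
    """
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # skip taps that would read before t = 0
                acc += w * signal[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Number of input samples one output sample can see after
    stacking one layer per entry in `dilations`."""
    return 1 + (kernel_size - 1) * sum(dilations)


# Toy "mixture" waveform; real inputs would be speech+music samples.
mixture = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = dilated_causal_conv1d(mixture, [0.5, 0.5], dilation=2)

# Doubling the dilation per layer grows the context exponentially.
rf = receptive_field(kernel_size=2, dilations=[1, 2, 4, 8])
```

Stacking layers with growing dilation lets a time-domain model cover long stretches of audio with few parameters, which is one reason such layers are a plausible fit for the phase-free, waveform-level separation the abstract describes.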

Keywords: temporal convolution; joint training; generative adversarial network; audio separation

Classification: TN912 [Electronics and Telecommunications: Communication and Information Systems]
