Research on Audio Super-resolution Method Based on FFTNet-GAN  Cited by: 2


Authors: XU Feng; LI Ping [1]

Affiliation: [1] Academy of Information Science and Engineering, Huaqiao University, Xiamen, Fujian 361021, China

Source: Journal of Signal Processing (《信号处理》), 2021, No. 1, pp. 59-65 (7 pages)

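Funding: Fujian Province Major Science and Technology Special Project (2020HZ02014); Natural Science Foundation of Fujian Province (2018J01095); Fujian Province University Industry-Academia-Research Cooperation Major Science and Technology Project (2013H6016); Huaqiao University Science and Technology Innovation Funding Program for Young and Middle-aged Teachers (ZQN-PY509).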

Abstract: This paper proposes a generative adversarial network model based on FFTNet for extreme audio super-resolution. The generator is a parallel, non-causal FFTNet with three-way split-and-sum non-local operations. This shallow model is fast and accurate: it better captures the long-term correlation structure of time-domain audio and extracts features at the desired resolution, which improves reconstruction performance. A discriminator of matching capacity is designed so that it fits stably into the adversarial architecture. A frequency-domain perceptual loss is combined with a sample-space loss under fixed weights to reduce reconstruction distortion and improve perceptual quality. In both subjective and objective evaluations, the proposed method outperforms the baseline model. The 2x/4x/6x restoration results show that the model has a strong capacity for high-frequency reconstruction, which helps improve the temporal resolution of audio signals.
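The fixed-weight combination of a sample-space loss and a frequency-domain loss mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the loss definitions or the weight, so everything here is an assumption made for illustration: mean squared error stands in for the sample-space loss, a log-magnitude spectral distance stands in for the perceptual term, and the names `sample_loss`, `spectral_loss`, `combined_loss` and the weight `lam` are hypothetical, not taken from the paper.

```python
import cmath
import math


def dft_magnitudes(x):
    """Magnitudes of the one-sided DFT of a real sequence (naive O(n^2) version)."""
    n = len(x)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    return mags


def sample_loss(pred, target):
    """Mean squared error in the time (sample) domain."""
    return sum((p - q) ** 2 for p, q in zip(pred, target)) / len(target)


def spectral_loss(pred, target, eps=1e-8):
    """Mean squared difference of log-magnitude spectra.

    Assumed stand-in for a frequency-domain perceptual loss; the paper's
    actual definition is not stated in the abstract.
    """
    mp, mt = dft_magnitudes(pred), dft_magnitudes(target)
    diffs = [(math.log(a + eps) - math.log(b + eps)) ** 2 for a, b in zip(mp, mt)]
    return sum(diffs) / len(diffs)


def combined_loss(pred, target, lam=0.1):
    """Fixed-weight sum of the two terms; lam is a hypothetical weight."""
    return sample_loss(pred, target) + lam * spectral_loss(pred, target)


# Toy signals: the target holds a low tone plus a high tone,
# while the "reconstruction" is missing the high-frequency component.
n = 64
target = [math.sin(2 * math.pi * 2 * t / n) + 0.5 * math.sin(2 * math.pi * 20 * t / n)
          for t in range(n)]
low_only = [math.sin(2 * math.pi * 2 * t / n) for t in range(n)]

print(combined_loss(target, target))        # identical signals: both terms vanish
print(combined_loss(low_only, target) > 0)  # missing high band is penalized
```

In this toy case the spectral term penalizes the missing high-frequency tone far more heavily than the sample-space term does, which is the general motivation for adding a frequency-domain term when the goal is high-frequency reconstruction.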

Keywords: audio super-resolution; bandwidth extension; FFTNet; generative adversarial network; high-frequency reconstruction

Classification: TP912 [Automation and Computer Technology]

 
