Frequency-Quantized Variational Autoencoder Based on 2D-FFT for Enhanced Image Reconstruction and Generation  


Authors: Jianxin Feng, Xiaoyao Liu

Affiliations: [1] School of Information Engineering, Dalian University, Dalian, 116622, China; [2] Key Laboratory of Communication and Networks, Dalian University, Dalian, 116622, China

Source: Computers, Materials & Continua (English-language journal), 2025, Issue 5, pp. 2087-2107 (21 pages)

Funding: Supported by the Interdisciplinary Project of Dalian University, DLUXK-2023-ZD-001.

Abstract: As a form of discrete representation learning, Vector Quantized Variational Autoencoders (VQ-VAE) have increasingly been applied to generative and multimodal tasks because their representations are easy to embed and highly expressive. However, existing VQ-VAEs usually quantize in the spatial domain, which ignores global structural information and can lead to codebook collapse and information coupling. This paper proposes a frequency-quantized variational autoencoder (FQ-VAE) to address these issues. The proposed method transforms image features into linear combinations in the frequency domain using a 2D fast Fourier transform (2D-FFT) and performs adaptive quantization on these frequency components to preserve the image's global relationships. The codebook is dynamically optimized, taking into account the usage frequency and dependency of code vectors, to avoid collapse and information coupling. Furthermore, a post-processing module based on graph convolutional networks is introduced to further improve reconstruction quality. Experimental results on four public datasets demonstrate that the proposed method outperforms state-of-the-art approaches in terms of Structural Similarity Index (SSIM), Learned Perceptual Image Patch Similarity (LPIPS), and Reconstruction Fréchet Inception Distance (rFID). In the experiments on the CIFAR-10 dataset, compared to the baseline method VQ-VAE, the proposed method improves these metrics by 4.9%, 36.4%, and 52.8%, respectively.
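
The core idea in the abstract (transform features with a 2D-FFT, then quantize the frequency components against a codebook) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `to_frequency_vectors` and `quantize_in_frequency`, the stacking of real and imaginary parts into one vector per frequency bin, and the plain nearest-neighbour lookup are all assumptions made for illustration. The paper's adaptive quantization, dynamic codebook optimization, and graph-convolutional post-processing are not reproduced here.

```python
import numpy as np

def to_frequency_vectors(features):
    """Map a real (C, H, W) feature map to one real vector per frequency bin.

    Hypothetical helper: returns an (H*W, 2*C) array holding the real and
    imaginary parts of the 2D-FFT coefficients of every channel.
    """
    C, H, W = features.shape
    # 2D FFT over the spatial axes: each coefficient depends on all pixels.
    spectrum = np.fft.fft2(features, axes=(-2, -1))
    stacked = np.concatenate([spectrum.real, spectrum.imag], axis=0)  # (2C, H, W)
    return stacked.reshape(2 * C, H * W).T

def quantize_in_frequency(features, codebook):
    """Nearest-neighbour quantization of frequency bins (illustrative only).

    features: real array of shape (C, H, W)
    codebook: real array of shape (K, 2*C)
    Returns (reconstruction, code indices of shape (H, W), per-code usage counts).
    """
    C, H, W = features.shape
    vecs = to_frequency_vectors(features)                             # (H*W, 2C)
    # Squared Euclidean distance from every bin to every code vector.
    dists = ((vecs[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = dists.argmin(axis=1)                                        # (H*W,)
    quantized = codebook[idx].T.reshape(2 * C, H, W)
    q_spectrum = quantized[:C] + 1j * quantized[C:]
    # The quantized spectrum is generally not Hermitian-symmetric, so the
    # inverse FFT may have an imaginary part; this sketch simply discards it.
    recon = np.fft.ifft2(q_spectrum, axes=(-2, -1)).real
    # Usage counts show which codes are never selected (a symptom of collapse).
    usage = np.bincount(idx, minlength=len(codebook))
    return recon, idx.reshape(H, W), usage

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 4, 4))
codebook = rng.standard_normal((8, 4))
recon, codes, usage = quantize_in_frequency(x, codebook)
```

Because every frequency coefficient is a weighted sum over all spatial positions, replacing one coefficient with a code vector changes the whole feature map, which is one way to read the abstract's claim about preserving global relationships. The `usage` counts are the kind of statistic a codebook update rule could consult; how the paper actually uses them is not stated in the abstract.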

Keywords: VAE; 2D-FFT; image reconstruction; image generation

Classification code: TP391.41 [Automation and Computer Technology: Computer Application Technology]

 
