Research on a Denoising Diffusion Probabilistic Model Based on a Selective State Space

Authors: SHE Zhiyong; KANG Jiarong; ZHANG Dongpo; GUO Xiaoxin[3]

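Affiliations: [1] College of Information Network Security, Xinjiang University of Political Science and Law, Tumushuk 844000, Xinjiang, China; [2] College of Mathematics and Electronic Information Engineering, Guangxi Minzu Normal University, Chongzuo 532200, Guangxi, China; [3] College of Computer Science and Technology, Jilin University, Changchun 130012, Jilin, China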

Source: Journal of Shanxi University (Natural Science Edition), 2025, No. 1, pp. 120-129 (10 pages)

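Funding: National Natural Science Foundation of China (82071995); President's Fund of Xinjiang University of Political Science and Law (XZZK2021002, XZZK2022008).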

Abstract: To address the low sampling efficiency, long training time, and high hardware resource overhead of the Denoising Diffusion Probabilistic Model (DDPM), a DDPM based on a selective state space is proposed. First, the method uses the dynamic selectivity of the selective State Space Model (SSM) to improve the sampling efficiency of DDPM on long sequences. Second, it scans the image in multiple directions so that more effective image information is captured during diffusion. Finally, it exploits the linear time complexity and parallel computation of the SSM to reduce the time and hardware resource overhead of DDPM training. Image generation experiments were conducted on the ImageNet and Flickr-Faces-High-Quality (FFHQ) datasets with DDPM, Denoising Diffusion Implicit Models (DDIM), Variational Auto-Encoder DDPM (VAE-DDPM), Vision Transformer DDPM (VIT-DDPM), and the proposed method, comparing Fréchet Inception Distance (FID), Structural Similarity Index (SSIM), Peak Signal-to-Noise Ratio (PSNR), and generation time for images at different resolutions. When generating 128×128 images, the proposed method improved FID, SSIM, and PSNR by 5.6%-22.6%, 4.7%-15.5%, and 1.9%-6.6%, respectively. The results indicate that the method effectively addresses the shortcomings of DDPM and outperforms the other diffusion models compared.
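The multi-directional scanning step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which directions are used or how the results are combined, so the four directions below (row-major and column-major, each forward and backward) and the averaging merge are assumptions chosen for illustration. The function names `multi_directional_scan` and `merge_scans` are hypothetical and are not taken from the paper.

```python
from typing import Dict, List

Grid = List[List[float]]


def multi_directional_scan(image: Grid) -> Dict[str, List[float]]:
    """Flatten a 2D grid into four 1D sequences, one per scan direction.

    Assumption: four directions (row/column, forward/backward). The paper's
    actual scan pattern is not given in the abstract.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    # Row-major order: left to right within a row, rows top to bottom.
    row_major = [image[r][c] for r in range(rows) for c in range(cols)]
    # Column-major order: top to bottom within a column, columns left to right.
    col_major = [image[r][c] for c in range(cols) for r in range(rows)]
    return {
        "row_forward": row_major,
        "row_backward": row_major[::-1],
        "col_forward": col_major,
        "col_backward": col_major[::-1],
    }


def merge_scans(scans: Dict[str, List[float]], rows: int, cols: int) -> Grid:
    """Map the four sequences back onto the grid and average them per pixel.

    Assumption: a plain average. In a real model each sequence would first
    pass through its own sequence model (for example an SSM block).
    """
    n = rows * cols
    merged = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c  # index of (r, c) in the row-major sequence
            k = c * rows + r  # index of (r, c) in the column-major sequence
            total = (
                scans["row_forward"][i]
                + scans["row_backward"][n - 1 - i]
                + scans["col_forward"][k]
                + scans["col_backward"][n - 1 - k]
            )
            merged[r][c] = total / 4.0
    return merged


# Usage: a 2x3 toy "image".
toy = [[1, 2, 3], [4, 5, 6]]
toy_scans = multi_directional_scan(toy)
print(toy_scans["col_forward"])  # [1, 4, 2, 5, 3, 6]
print(merge_scans(toy_scans, 2, 3))
```

Each pixel appears exactly once in every sequence, so a sequence model that only sees earlier positions still receives context from several spatial directions once the scans are merged. Because no processing is applied between scanning and merging here, the merge simply recovers the input grid.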

Keywords: diffusion model; linear time complexity; multi-directional scanning

Classification number: TP391.4 [Automation and Computer Technology: Computer Application Technology]

 
