Image Super-Resolution Network Based on Global Dependency Transformer


Authors: LIU Zihan; ZHOU Dengwen[1]; LIU Yukai (School of Control and Computer Engineering, North China Electric Power University, Beijing 102206, China)

Affiliation: [1] School of Control and Computer Engineering, North China Electric Power University, Beijing 102206, China

Source: Journal of Computer Applications (《计算机应用》), 2024, No. 5, pp. 1588-1596 (9 pages)

Abstract: Image super-resolution networks based on deep learning are currently implemented mainly with convolutions. Compared with a traditional Convolutional Neural Network (CNN), the main advantage of the Transformer in image super-resolution is its ability to model long-range dependencies. However, most Transformer-based image super-resolution models cannot establish global dependencies when the parameter count is small and the network has few layers, which limits their performance. To establish global dependencies in a super-resolution network, an image Super-Resolution network based on a Global Dependency Transformer (GDTSR) is proposed. Its main component is the Residual Square Axial Window Block (RSAWB), whose internal axial window Transformer residual layer uses axial windows and self-attention so that each pixel can establish a global dependency with the entire feature map. In addition, the image reconstruction modules of most current super-resolution models are composed of convolutions; to integrate the extracted feature information dynamically, GDTSR combines Transformer and convolution to jointly reconstruct the super-resolution image. Experimental results show that, on the five standard test sets Set5, Set14, B100, Urban100 and Manga109, GDTSR achieves the best Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity (SSIM) at all three scale factors (×2, ×3, ×4), with especially clear improvements on Urban100 and Manga109, whose images are large.
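For readers unfamiliar with the PSNR metric cited in the abstract, the following is a minimal sketch of the standard definition, PSNR = 10 · log10(MAX² / MSE), in plain Python. The function name `psnr`, the nested-list image representation, and the default peak value of 255 are illustrative assumptions; the abstract does not state the evaluation protocol (colour channel, border cropping), so this is not necessarily how the reported numbers were computed.

```python
import math
from typing import Sequence


def psnr(
    reference: Sequence[Sequence[float]],
    estimate: Sequence[Sequence[float]],
    max_value: float = 255.0,  # assumed peak value for 8-bit images
) -> float:
    """Return the peak signal-to-noise ratio (dB) of two same-sized grayscale images."""
    # Both images must have the same number of rows and the same row lengths.
    if len(reference) != len(estimate) or any(
        len(ref_row) != len(est_row) for ref_row, est_row in zip(reference, estimate)
    ):
        raise ValueError("images must have the same shape")

    pixel_count = sum(len(row) for row in reference)
    if pixel_count == 0:
        raise ValueError("images must not be empty")

    # Mean squared error over all pixels.
    squared_error = sum(
        (r - e) ** 2
        for ref_row, est_row in zip(reference, estimate)
        for r, e in zip(ref_row, est_row)
    )
    mse = squared_error / pixel_count

    # Identical images have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher PSNR means the reconstructed image is closer to the reference in the mean-squared-error sense; SSIM, the other metric named in the abstract, instead compares local structure and is not covered by this sketch.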

Keywords: image super-resolution; Transformer; self-attention; global dependency; axial window

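Classification number: TP391.4 [Automation and Computer Technology: Computer Application Technology]; TP183 [Automation and Computer Technology: Computer Science and Technology]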

 
