Affiliations: [1] School of Computer Science, Nanjing University of Information Science and Technology, Nanjing 210044, China; [2] School of Journalism and Communication, Northwest University, Xi'an 710127, China
Source: Journal of Image and Graphics (中国图象图形学报), 2025, No. 3, pp. 784-797 (14 pages)
Funding: National Natural Science Foundation of China (U20B2065).
Abstract: Objective: To address the problems of a single feature-extraction pattern, loss of high-frequency detail, and structural distortion in Transformer models for super-resolution tasks, a cross-scale Transformer image super-resolution reconstruction model incorporating channel attention is proposed. Method: The model consists of four modules: shallow feature extraction, cross-scale deep feature extraction, multilevel feature fusion, and high-quality reconstruction. Shallow feature extraction applies convolution to the early image to obtain a more stable output. Cross-scale deep feature extraction uses a cross-scale Transformer and an enhanced channel attention mechanism to enlarge the receptive field and extract features at different scales through weighted selection for subsequent fusion. The multilevel feature fusion module uses the enhanced channel attention mechanism to dynamically adjust the channel weights of features at different scales, promoting the model's learning of rich contextual information and strengthening its capability in image super-resolution reconstruction. Results: Evaluation on the Set5, Set14, BSD100 (Berkeley segmentation dataset 100), Urban100 (urban scene 100), and Manga109 benchmark datasets shows that, compared with the SwinIR super-resolution model, the proposed model improves peak signal-to-noise ratio (PSNR) by 0.06-0.25 dB and produces reconstructions with better visual quality. Conclusion: By fusing convolutional features with Transformer features and using the enhanced channel attention mechanism to suppress noise and redundant information, the proposed model reduces the likelihood of blur and distortion in reconstructed images and effectively improves super-resolution performance; tests on several public datasets verify its effectiveness.

Extended abstract: Objective: Image super-resolution reconstruction refers to converting low-resolution (LR) images into high-resolution (HR) images of the same scene. In recent years, the technique has been widely used in computer vision, image processing, and other fields owing to its broad practical value and theoretical importance. Although models based on convolutional neural networks have made remarkable progress, most super-resolution networks remain single-level, end-to-end designs for improving reconstruction performance. This approach often overlooks multilevel feature information during reconstruction, limiting the model's performance. With the advancement of deep learning, Transformer-based network architectures have been introduced into computer vision, yielding substantial results, and researchers have applied Transformer models to low-level vision tasks, including image super-resolution reconstruction. In this context, however, the Transformer model suffers from a single feature-extraction pattern, loss of high-frequency details in the reconstructed image, and structural distortion. A cross-scale Transformer image super-resolution reconstruction model with fused channel attention is proposed to address these problems. Method: The model comprises four modules: shallow feature extraction, cross-scale deep feature extraction, multilevel feature fusion, and a high-quality reconstruction module. Shallow feature extraction uses convolution to process the early image to obtain a highly stable output, as convolutional layers provide stable optimization and extraction during early visual feature processing. The cross-scale deep feature extraction module uses the cross-scale Transformer and the enhanced channel attention mechanism to acquire features at different scales. The core of the cross-scale Transformer lies in the cross-scale self-attention mechanism and the gated convolutional feedforward network.
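The channel attention described in the abstract dynamically reweights feature channels so that informative channels are amplified and noisy or redundant ones are suppressed. The paper's "enhanced" variant is not detailed in this record, so the following is only a minimal squeeze-and-excitation-style sketch in plain Python; the two small fully connected layers `fc1` and `fc2` are hypothetical learned weights, and real implementations operate on tensors in a deep learning framework.

```python
import math

def channel_attention(feat, fc1, fc2):
    """Reweight the channels of a feature map by learned attention weights.

    feat: list of C channels, each an H x W grid (list of lists of floats).
    fc1, fc2: hypothetical learned weight matrices of the two small
              fully connected layers in the attention branch.
    """
    # Squeeze: global average pooling reduces each channel to one descriptor.
    desc = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feat]
    # Excitation: FC -> ReLU -> FC -> sigmoid yields a weight per channel in (0, 1).
    hidden = [max(0.0, sum(w * d for w, d in zip(row, desc))) for row in fc1]
    weights = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
               for row in fc2]
    # Scale: multiply every value in each channel by that channel's weight.
    return [[[v * w for v in row] for row in ch] for ch, w in zip(feat, weights)]

# Toy usage: two 2x2 channels with identity weights in the attention branch.
feat = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
identity = [[1.0, 0.0], [0.0, 1.0]]
out = channel_attention(feat, identity, identity)
```

In the fusion module this per-channel weighting is what lets the model emphasize scale-specific features before they are merged, rather than averaging all channels uniformly.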
Keywords: image super-resolution; cross-scale Transformer; channel attention mechanism; feature fusion; deep learning
Classification: TP391 [Automation and Computer Technology / Computer Application Technology]