Authors: 沈恺涛 (SHEN Kai-tao), 闵天悦 (MIN Tian-yue), 胡德敏 (HU De-min)[2] (Information Office, University of Shanghai for Science & Technology; School of Optical-Electrical & Computer Engineering, University of Shanghai for Science & Technology, Shanghai 200093, China)
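Affiliations: [1] Information Office, University of Shanghai for Science & Technology; [2] School of Optical-Electrical & Computer Engineering, University of Shanghai for Science & Technology, Shanghai 200093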
Source: 《软件导刊》 (Software Guide), 2023, No. 2, pp. 21-27 (7 pages)
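Funding: National Natural Science Foundation of China (61170277, 61472256); Key Scientific Research and Innovation Project of the Shanghai Municipal Education Commission (12zz137); Shanghai First-Class Discipline Construction Project (S1201YLXK).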
Abstract: Attention-mechanism models built on convolutional neural networks (CNNs) can focus only on local features and have a small receptive field when reconstructing infrared images that span a wide wavelength range. To address this, the paper proposes a model that fuses a lightweight vision Transformer (ViT) with a CNN and is suited to reconstructing wide-range infrared images. The model combines improved lightweight residual blocks with lightweight ViT blocks to build a global self-attention mechanism, which learns long-distance attention dependencies between different feature-map regions to assist reconstruction and constrain the solution space. A Huber loss function is used so that the model converges stably, and iterative up- and down-sampling is used to mine the deep transformation relationship between high- and low-resolution image pairs. In experiments on near-infrared and far-infrared image datasets, the model, with 1,031K parameters, surpasses the lightweight models SRResNet (1,518K parameters) and CARN (1,592K parameters) in peak signal-to-noise ratio (PSNR) and structural similarity (SSIM), and comes close to the heavyweight model EDSR (4,543K parameters), indicating that the model can effectively reconstruct infrared images of different wavelengths.
Keywords: infrared image; lightweight; vision Transformer; super-resolution; self-attention
Classification: TP183 [Automation and Computer Technology: Control Theory and Control Engineering]
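
Note on the abstract: it names a Huber loss for training and peak signal-to-noise ratio as one of the evaluation metrics, but this record gives no formulas or settings. The following is a minimal sketch of the standard textbook definitions of both, not the authors' implementation. The function names, the `delta` threshold of 1.0, and the 8-bit peak value of 255 are assumptions made for illustration only.

```python
import math


def huber_loss(pred, target, delta=1.0):
    """Mean Huber loss over two equal-length sequences of pixel values.

    delta is an assumed threshold; the record does not state the value used.
    """
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("pred and target must be non-empty and equal in length")
    total = 0.0
    for p, t in zip(pred, target):
        err = abs(p - t)
        if err <= delta:
            total += 0.5 * err * err              # quadratic for small errors
        else:
            total += delta * (err - 0.5 * delta)  # linear for large errors
    return total / len(pred)


def psnr(pred, target, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the inputs are identical.

    max_value is the assumed peak pixel value (255 for 8-bit images).
    """
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("pred and target must be non-empty and equal in length")
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

The Huber loss behaves like squared error for small residuals and like absolute error for large ones, which limits the influence of outlier pixels and is the usual reason it is chosen for stable convergence.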