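Infrared and Visible Multi-Resolution Image Fusion Based on a Multi-Task Convolutional Neural Network (基于多任务卷积神经网络的红外与可见光多分辨率图像融合)  Cited by: 8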

A Multi-Task Convolutional Neural Network for Infrared and Visible Multi-Resolution Image Fusion


Authors: ZHU Wen-qing (朱雯青), ZHANG Ning (张宁) [1,2,3], LI Zheng (李争), LIU Peng (刘鹏) [1,3], TANG Xin-yi (汤心溢) [1,3]

Affiliations: [1] Shanghai Institute of Technical Physics, Chinese Academy of Sciences, Shanghai 200083, China; [2] University of Chinese Academy of Sciences, Beijing 100049, China; [3] Key Laboratory of Infrared System Detection and Imaging Technology, Chinese Academy of Sciences, Shanghai 200083, China

Source: Spectroscopy and Spectral Analysis (《光谱学与光谱分析》), 2023, No. 1, pp. 289-296 (8 pages)

Funding: Supported by the National "13th Five-Year" Pre-Research Fund Project (104040402).

Abstract: Infrared and visible image fusion has long been a research hotspot in the imaging field. Fusion can compensate for the deficiencies of a single sensor and provide a good imaging foundation for image understanding and analysis. Owing to limits of production technology and cost, the resolution of infrared detectors is much lower than that of visible detectors, and this resolution gap between source images hinders practical application to some extent. To address the mismatch, a multi-task convolutional network framework combining infrared image super-resolution reconstruction and image fusion is proposed and applied to multi-resolution image fusion.

In terms of network structure, first, a dual-channel network extracts infrared and visible features separately, so that the algorithm is not limited by the resolution of the source images. Second, a feature up-sampling module is proposed: bilinear interpolation first increases the number of pixels, and a multilayer perceptron then refines the mapping between the smooth pixel space and the high-frequency space, so that infrared images can be up-sampled at arbitrary scales without retraining the model. Third, linear attention is introduced into the network to learn the nonlinear relationships between positions in feature space, suppressing irrelevant information and strengthening the network's expression of global information.

In terms of the loss function, a gradient loss is proposed that retains the filter responses with the larger absolute value from the infrared and visible images and computes the Frobenius norm between these values and the responses of the reconstructed fused image, so that fused images can be generated without an ideal fused image serving as ground truth. In addition, the multi-task model is optimized under the joint action of gradient loss and pixel loss, so the fused image and the high-resolution infrared image are reconstructed simultaneously.

The algorithm is trained on the RoadScene dataset and compared with four other related algorithms on the TNO dataset. Subjectively, the method accepts source images of arbitrary resolution; the fused images show prominent infrared targets and rich visible detail and texture; when source resolutions differ greatly it can still reconstruct high-resolution infrared images with clear features; and the model generalizes well. Objectively, it performs well on several evaluation metrics, including information entropy, the sum of the correlations of differences, and spatial frequency. The results indicate that the reconstructed fused images are information-rich, with a high information conversion rate and high clarity, verifying the effectiveness of the algorithm.
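The gradient loss described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not name the filter, so a 3x3 Laplacian kernel is assumed here, and the function names `filter_response` and `gradient_loss` are hypothetical. The sketch also assumes the infrared image has already been up-sampled to the visible image's size.

```python
import numpy as np

# Assumed filter: a 3x3 Laplacian. The abstract only says "filter response".
LAPLACIAN = np.array([[0.0, 1.0, 0.0],
                      [1.0, -4.0, 1.0],
                      [0.0, 1.0, 0.0]])


def filter_response(img, kernel=LAPLACIAN):
    """Cross-correlate a 2-D image with a 3x3 kernel, using edge padding.

    The output has the same shape as the input.
    """
    img = np.asarray(img, dtype=np.float64)
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out


def gradient_loss(fused, infrared, visible):
    """Frobenius norm between the fused response and the per-pixel
    larger-magnitude response of the two source images.

    All three inputs are assumed to be 2-D arrays of the same shape.
    """
    g_ir = filter_response(infrared)
    g_vis = filter_response(visible)
    # Keep, at each pixel, whichever source response has larger |value|.
    target = np.where(np.abs(g_ir) >= np.abs(g_vis), g_ir, g_vis)
    diff = filter_response(fused) - target
    return float(np.linalg.norm(diff, ord="fro"))
```

Because the target is built from the two source images themselves, no reference fused image is needed, which matches the abstract's claim of training without ground truth. In a training setup this term would be combined with a pixel loss; the weighting between the two is not given in the abstract.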

Keywords: infrared and visible image fusion; multi-resolution image fusion; linear attention; gradient loss; infrared image super-resolution

Classification number: TP391.41 [Automation and Computer Technology: Computer Application Technology]

 
