Infrared and Visible Images Fusion Method Based on Multi-Scale Features and Multi-head Attention

Authors: LI Qiuheng; DENG Hao [1,2]; LIU Guihua; PANG Zhongxiang [3]; TANG Xue; ZHAO Junqin; LU Mengyuan

Affiliations: [1] School of Information Engineering, Southwest University of Science and Technology, Mianyang 621010, Sichuan, China; [2] Sichuan Key Laboratory of Special Environmental Robotics, Southwest University of Science and Technology, Mianyang 621010, Sichuan, China; [3] China Telecom Corporation, Chengdu Branch, Chengdu 610066, Sichuan, China; [4] Institute of Aerospace Technology, China Aerodynamics Research and Development Center, Mianyang 621006, Sichuan, China

Source: Infrared Technology (《红外技术》), 2024, No. 7, pp. 765-774 (10 pages)

Funding: Equipment Advance Research Shared Technology Project (50927010302).

Abstract: Infrared and visible image fusion is prone to detail loss, and existing fusion strategies struggle to balance visual detail features against infrared (IR) target features. To address these problems, this study proposes a fusion method that combines multi-scale feature fusion with efficient multi-head self-attention (EMSA). First, a multi-scale encoding network extracts features of the source images at different scales to improve the description of targets and scenes. Second, a fusion strategy that combines Transformer-based multi-head transposed attention with residual dense blocks balances fused detail against overall structure. Finally, the multi-scale fused feature maps are fed into a nested-connection decoding network, which reconstructs a fused image with salient IR targets and rich detail. Experiments on the public TNO and M³FD datasets, comparing against seven classic fusion methods, show that the proposed method performs better in both visual quality and quantitative evaluation metrics, and that the fused images it generates achieve better results on an object detection task.
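The abstract names multi-head transposed attention as part of the fusion strategy but gives no implementation details. The following is a minimal NumPy sketch under one assumption: that "transposed" means attention is computed across channels rather than across spatial positions, so each head's attention map has size (C/heads) x (C/heads) regardless of image resolution. The function name `transposed_attention`, the plain matrix projections `w_q`, `w_k`, `w_v`, the L2 normalisation, and the `temperature` parameter are all illustrative choices, not the authors' implementation.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def transposed_attention(feat, w_q, w_k, w_v, num_heads, temperature=1.0):
    """Channel-wise (transposed) multi-head attention, hypothetical sketch.

    feat: feature map of shape (C, H, W)
    w_q, w_k, w_v: (C, C) matrices acting as pointwise projections (assumed)
    Returns the attended features (C, H, W) and the attention maps
    of shape (num_heads, C // num_heads, C // num_heads).
    """
    c, h, w = feat.shape
    assert c % num_heads == 0, "channels must divide evenly into heads"
    d = c // num_heads

    x = feat.reshape(c, h * w)                      # flatten space: (C, HW)
    q = (w_q @ x).reshape(num_heads, d, h * w)      # (heads, d, HW)
    k = (w_k @ x).reshape(num_heads, d, h * w)
    v = (w_v @ x).reshape(num_heads, d, h * w)

    # L2-normalise each channel over spatial positions (an assumption)
    q = q / (np.linalg.norm(q, axis=-1, keepdims=True) + 1e-8)
    k = k / (np.linalg.norm(k, axis=-1, keepdims=True) + 1e-8)

    # Attention between channels: (heads, d, HW) @ (heads, HW, d) -> (heads, d, d)
    attn = softmax(q @ k.transpose(0, 2, 1) * temperature, axis=-1)

    out = attn @ v                                  # (heads, d, HW)
    return out.reshape(c, h, w), attn


rng = np.random.default_rng(0)
C, H, W, HEADS = 8, 16, 16, 2
feat = rng.standard_normal((C, H, W))
w_q, w_k, w_v = (rng.standard_normal((C, C)) for _ in range(3))
out, attn = transposed_attention(feat, w_q, w_k, w_v, HEADS)
print(out.shape, attn.shape)  # (8, 16, 16) (2, 4, 4)
```

Because the attention map is built between channels, its cost grows linearly with the number of pixels, which is the usual motivation for this kind of design on full-resolution feature maps.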

Keywords: image fusion; infrared and visible images; multi-scale features; multi-head self-attention; Transformer

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]
