Multimodal Object Detection Based on Feature Interaction and Adaptive Grouping Fusion

Authors: YE Zhihui; WU Jian; ZHAO Xiaozhong [1]; WANG Wenjuan [1]; SHAO Xinguang

Affiliations: [1] China Tobacco Zhejiang Industrial Co., Ltd., Hangzhou 310008, Zhejiang, China; [2] Polytechnic Institute, Zhejiang University, Hangzhou 310058, Zhejiang, China

Source: Infrared Technology (《红外技术》), 2025, No. 4, pp. 468-474 (7 pages)

Funding: National Natural Science Foundation of China (62002320).

Abstract: To improve object detection in complex scenes, this paper combines deep learning with multimodal information fusion and proposes a multimodal object detection model based on feature interaction and adaptive grouping fusion. The model takes infrared and visible-light images of the target as input, builds a symmetric dual-branch feature extraction structure on the PP-LCNet network, and introduces a feature interaction module so that features from the two modalities complement each other during extraction. Next, a binarized grouping attention mechanism is designed: global pooling combined with the Sign function groups the output features of the interaction module by the object category they belong to, and a spatial attention mechanism then enhances the object information within each feature group. Finally, from the group-enhanced features, same-class feature groups at different scales are extracted and fused across scales from deep to shallow by adaptive weighting, and objects are predicted from the fused features at each scale. Experimental results show that the proposed method brings considerable improvements in multimodal feature interaction, key feature enhancement, and multi-scale fusion, and that the model is more robust in complex scenes and better suited to different scenarios.
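The abstract names two mechanisms without giving formulas: grouping by global pooling plus the Sign function, and adaptive weighted fusion from deep to shallow. The NumPy sketch below is only one plausible reading of those ideas, not the authors' implementation. Everything in it is an assumption made for illustration: the function names, centering the pooled responses before taking the sign, the parameter-free sigmoid attention, nearest-neighbour upsampling, and softmax-normalised fusion weights.

```python
import numpy as np


def binary_group(feat):
    """Split the channels of a (C, H, W) feature map into two groups.

    Assumption: a channel's group is the sign of its globally pooled
    response after centering on the mean over all channels.
    """
    pooled = feat.mean(axis=(1, 2))           # global average pooling -> (C,)
    signs = np.sign(pooled - pooled.mean())   # +1 / -1 (0 only on exact ties)
    mask = signs >= 0                         # ties go to the first group
    return feat[mask], feat[~mask]


def spatial_attention(group):
    """Reweight each location by a sigmoid of the channel-mean map.

    Assumption: a parameter-free stand-in for a learned spatial attention.
    """
    if group.shape[0] == 0:                   # nothing to enhance
        return group
    attn = 1.0 / (1.0 + np.exp(-group.mean(axis=0, keepdims=True)))  # (1, H, W)
    return group * attn


def upsample2x(feat):
    """Nearest-neighbour 2x upsampling of a (C, H, W) map."""
    return feat.repeat(2, axis=1).repeat(2, axis=2)


def adaptive_fuse(shallow, deep, logits):
    """Fuse a deep map into a shallow one with softmax-normalised weights.

    Assumption: `logits` holds two scalars that would be learned in training;
    both maps must have the same channel count.
    """
    w = np.exp(logits - np.max(logits))
    w = w / w.sum()
    return w[0] * shallow + w[1] * upsample2x(deep)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat = rng.normal(size=(8, 16, 16))       # hypothetical interaction-module output
    group_a, group_b = binary_group(feat)
    enhanced = [spatial_attention(g) for g in (group_a, group_b)]
    print([g.shape for g in enhanced])

    shallow = rng.normal(size=(4, 16, 16))    # hypothetical same-class groups
    deep = rng.normal(size=(4, 8, 8))         # at two neighbouring scales
    fused = adaptive_fuse(shallow, deep, np.array([0.3, -0.1]))
    print(fused.shape)
```

In this sketch the sign-based split can yield different channel counts at each scale, so the fusion step is demonstrated on two maps whose channel counts already match. How the paper aligns same-class groups across scales is not stated in the abstract.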

Keywords: multimodal object detection; feature interaction; binarized grouping; adaptive fusion

Classification code: TP391.41 [Automation and Computer Technology: Computer Application Technology]
