RGB-D salient object detection based on cross-modal feature fusion (基于跨模态特征融合的RGB-D显著性目标检测)


Authors: Li Kexin (李可新), He Li (何丽) [1], Liu Zhening (刘哲凝), Zhong Runhao (钟润豪)

Affiliation: [1] College of Intelligent Manufacturing Modern Industry (College of Mechanical Engineering), Xinjiang University, Urumqi 830017, China

Source: Foreign Electronic Measurement Technology (《国外电子测量技术》), 2024, No. 6, pp. 59-67 (9 pages)

Abstract: RGB-D salient object detection has attracted growing attention owing to its effectiveness and the ease of capturing depth cues. Existing work usually focuses on learning shared representations through various fusion strategies, and few methods explicitly consider how to preserve the modality-specific features of RGB and depth. This paper proposes a cross-modal feature fusion network that preserves the RGB and depth modalities for RGB-D salient object detection and improves detection performance by exploiting both the shared information and the modality-specific properties of RGB and depth. Specifically, an RGB-modality network, a depth-modality network, and a shared learning network generate RGB-specific, depth-specific, and shared saliency prediction maps, respectively. A cross-modal feature fusion module fuses cross-modal features in the shared learning network and propagates them to the next layer to integrate cross-level information. In addition, a multi-modal feature aggregation module integrates the modality-specific features from each individual decoder into the shared decoder, providing rich complementary multi-modal information that improves saliency detection performance. Finally, skip connections combine hierarchical features between the encoder and decoder layers. Experiments on four benchmark datasets, comparing against seven state-of-the-art methods (ten according to the journal's English abstract), show that the proposed method outperforms the other methods.
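The three-stream data flow described in the abstract can be illustrated with a minimal NumPy sketch. The record does not give the actual operators, so everything here is an assumption made for illustration: the function names (`fuse_cross_modal`, `aggregate_multi_modal`, `saliency_map`, `run_sketch`) are hypothetical, the fusion rule (element-wise product plus sum) and the sigmoid-gated aggregation are stand-ins rather than the authors' modules, and all feature levels are given the same shape to avoid resampling.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, used here for gating and for squashing to [0, 1]."""
    return 1.0 / (1.0 + np.exp(-x))


def fuse_cross_modal(rgb_feat, depth_feat, prev_fused=None):
    """Hypothetical cross-modal fusion (assumed rule, not the paper's).

    The element-wise product stands in for cues shared by both modalities,
    the sum for complementary cues. The previous level's fused features,
    if given, are added to mimic cross-level propagation.
    """
    fused = rgb_feat * depth_feat + (rgb_feat + depth_feat)
    if prev_fused is not None:
        fused = fused + prev_fused
    return fused


def aggregate_multi_modal(shared_feat, rgb_feat, depth_feat):
    """Hypothetical aggregation of modality-specific decoder features.

    A sigmoid gate computed from the shared features weights the RGB and
    depth features before they are added to the shared stream.
    """
    gate = sigmoid(shared_feat)
    return shared_feat + gate * rgb_feat + (1.0 - gate) * depth_feat


def saliency_map(feat):
    """Collapse a (C, H, W) feature tensor to an (H, W) map in [0, 1]."""
    return sigmoid(feat.mean(axis=0))


def run_sketch(seed=0, levels=3, shape=(4, 8, 8)):
    """Run the sketch on random features standing in for encoder outputs."""
    rng = np.random.default_rng(seed)
    rgb_levels = [rng.standard_normal(shape) for _ in range(levels)]
    depth_levels = [rng.standard_normal(shape) for _ in range(levels)]

    # Shared stream: fuse level by level, carrying the result forward.
    fused = None
    for rgb_feat, depth_feat in zip(rgb_levels, depth_levels):
        fused = fuse_cross_modal(rgb_feat, depth_feat, prev_fused=fused)

    # Shared decoder: pull in the modality-specific features.
    shared = aggregate_multi_modal(fused, rgb_levels[-1], depth_levels[-1])

    # One prediction map per stream, as in the abstract.
    return {
        "rgb": saliency_map(rgb_levels[-1]),
        "depth": saliency_map(depth_levels[-1]),
        "shared": saliency_map(shared),
    }
```

Calling `run_sketch()` returns three maps keyed `"rgb"`, `"depth"`, and `"shared"`, mirroring the three saliency predictions the abstract mentions. A real network would replace the random tensors with learned convolutional features and would need resampling between levels of different resolution.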

Keywords: RGB-D salient object detection; cross-modal fusion network; cross-modal feature fusion; multi-modal aggregation

Classification code: TN2 [Electronics and Telecommunications: Physical Electronics]

 
