Salient object detection in remote sensing images based on pyramid vision Transformer

Authors: WANG Yang (王杨); FAN Xiangxiang (范祥祥); YU Deshui (于得水); ZHANG Ying (张英); WU Haoran (吴昊冉); ZHANG Shikun (张士坤)

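Affiliations: [1] School of Big Data and Artificial Intelligence, Wuhu University, Wuhu 241008, Anhui, China; [2] School of Computer and Information, Anhui Normal University, Wuhu 241002, Anhui, China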

Source: Journal of Anhui Science and Technology University, 2025, No. 2, pp. 49-58 (10 pages)

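Funding: Natural Science Research Project of Anhui Higher Education Institutions (2022AH052899); Scientific Research Project of Wanjiang College, Anhui Normal University (WHKYZD-202402); Teaching Quality Engineering Project of Wanjiang College, Anhui Normal University (WHJXTD-202402).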

Abstract: To address the poor salient object detection performance caused by multi-scale targets and complex backgrounds in optical remote sensing images, this paper proposes ADNet, a model based on the Pyramid Vision Transformer. First, a Shuffle Weighted Spatial Attention Module is introduced to strengthen ADNet's focus on target regions and reduce background interference. Second, a Dynamic Semantic Fusion Module is designed to understand and fuse the semantic information in an image in real time, helping ADNet segment more precisely. Third, an Adaptive Edge Enhancement Module extracts and enhances edge information so that edge segmentation is more accurate. Finally, a Saliency Feature Decoder fuses the outputs of these three core modules to produce the final saliency map. ADNet performs well on both the public EORSSD dataset and the self-constructed FORSSD dataset, with its best results on FORSSD: F-measure, E-measure, S-measure, and MAE reach 0.9180, 0.9834, 0.9316, and 0.0065, respectively. Through the combined work of the Pyramid Vision Transformer and the three modules, ADNet handles multi-scale targets and complex backgrounds in optical remote sensing images and improves salient object detection performance.
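
Two of the reported metrics, MAE and F-measure, can be illustrated with a short sketch. The abstract does not state the evaluation protocol, so this is not the authors' evaluation code: the fixed binarization threshold of 0.5 and the weighting beta_sq = 0.3 (a value commonly used in salient object detection work) are assumptions, and the function names `mae` and `f_measure` are hypothetical. E-measure and S-measure are more involved and are omitted here.

```python
def mae(pred, gt):
    """Mean absolute error between a saliency map and a ground-truth mask.

    Both inputs are 2D lists of equal shape with values in [0, 1].
    """
    total = 0.0
    count = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            total += abs(p - g)
            count += 1
    return total / count


def f_measure(pred, gt, threshold=0.5, beta_sq=0.3):
    """F-measure of a thresholded saliency map against a ground-truth mask.

    threshold and beta_sq are assumed values, not taken from the paper.
    """
    tp = fp = fn = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            predicted = p >= threshold  # pixel predicted as salient
            actual = g >= 0.5           # pixel salient in the ground truth
            if predicted and actual:
                tp += 1
            elif predicted:
                fp += 1
            elif actual:
                fn += 1
    if tp == 0:
        # No correctly detected salient pixel: precision and recall are 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


if __name__ == "__main__":
    # Toy 2x2 saliency map and mask, purely illustrative data.
    pred = [[0.9, 0.8], [0.2, 0.6]]
    gt = [[1, 1], [0, 0]]
    print("MAE:", mae(pred, gt))
    print("F-measure:", f_measure(pred, gt))
```

Lower MAE and higher F-measure indicate a saliency map closer to the ground truth, which matches the direction of the figures reported for FORSSD.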

Keywords: salient object detection (SOD); optical remote sensing image (ORSI); attention mechanism; edge enhancement; semantic information

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]
