Authors: QI Jing; HU Min; ZHANG Jingbo[3] (DFH Satellite Co., Ltd., Beijing 100094, China; School of Aerospace Science and Technology, Space Engineering University, Beijing 102206, China; Beijing Institute of Space Science and Technology Information, Beijing 100094, China)
Affiliations: [1] DFH Satellite Co., Ltd., Beijing 100094; [2] School of Aerospace Science and Technology, Space Engineering University, Beijing 102206; [3] Beijing Institute of Space Science and Technology Information, Beijing 100094
Source: Spacecraft Recovery & Remote Sensing, 2023, No. 4, pp. 79-87 (9 pages)
Abstract: Traditional convolutional neural network (CNN)-based methods for scene classification of satellite remote sensing images overlook the global semantic features of scene images and the discriminative features of remote sensing images at multiple scales. To address this problem, building on the vision transformer (ViT) and multi-scale features, this paper proposes a feature-enhanced multi-scale vision transformer method for remote sensing image scene classification. The method adopts a dual-branch structure that divides the remote sensing image into patches of different sizes at two scales. Position encoding and transformers are first applied to learn features from the patches at the two scales respectively; a channel attention mechanism is then used to enhance the patch-token features output by the transformers; finally, the class tokens learned at the two scales and the enhanced features are fused for decision-making, thereby achieving remote sensing image scene classification. Experiments on the publicly available optical remote sensing image datasets AID and NWPU-RESISC45 show that the method achieves a scene classification accuracy of (95.27±0.39)% on AID and (92.50±0.14)% on NWPU-RESISC45, outperforming baseline methods such as CaffeNet, VGG, GoogLeNet, and ViT. This work improves the model's ability to perceive global semantics and multi-scale features, and is of great significance for applying satellite remote sensing image scene classification to tasks such as land monitoring and urban planning.
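The sketch below illustrates the pipeline outlined in the abstract: a dual-branch ViT with two patch scales, channel attention applied to the patch tokens of each branch, and fusion of the class tokens with the enhanced features for classification. It is a minimal PyTorch sketch under stated assumptions; the module names, dimensions, patch sizes, and the SE-style channel attention are illustrative choices, not the authors' released implementation.

```python
# Minimal sketch of a feature-enhanced multi-scale ViT for scene classification.
# All hyperparameters and module designs below are assumptions for illustration.
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """SE-style channel attention that re-weights patch-token feature channels."""

    def __init__(self, dim, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(dim, dim // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(dim // reduction, dim),
            nn.Sigmoid(),
        )

    def forward(self, tokens):                       # tokens: (B, N, D)
        weights = self.fc(tokens.mean(dim=1))        # average over patches -> (B, D)
        return tokens * weights.unsqueeze(1)         # channel-wise re-weighting


class Branch(nn.Module):
    """One ViT branch: patch embedding + position encoding + transformer encoder."""

    def __init__(self, img_size=224, patch_size=16, dim=256, depth=4, heads=8):
        super().__init__()
        num_patches = (img_size // patch_size) ** 2
        self.patch_embed = nn.Conv2d(3, dim, kernel_size=patch_size, stride=patch_size)
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches + 1, dim))
        layer = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)
        self.attn = ChannelAttention(dim)

    def forward(self, x):                            # x: (B, 3, H, W)
        patches = self.patch_embed(x).flatten(2).transpose(1, 2)   # (B, N, D)
        cls = self.cls_token.expand(x.size(0), -1, -1)
        tokens = self.encoder(torch.cat([cls, patches], dim=1) + self.pos_embed)
        cls_out, patch_out = tokens[:, 0], tokens[:, 1:]
        patch_out = self.attn(patch_out)             # feature enhancement on patch tokens
        return cls_out, patch_out.mean(dim=1)        # class token + pooled enhanced features


class MultiScaleViT(nn.Module):
    """Two branches with different patch sizes; outputs are fused for classification."""

    def __init__(self, num_classes=45, dim=256):
        super().__init__()
        self.coarse = Branch(patch_size=32, dim=dim)  # fewer, larger patches
        self.fine = Branch(patch_size=16, dim=dim)    # more, smaller patches
        self.head = nn.Linear(dim * 4, num_classes)   # fuse 2 class tokens + 2 enhanced features

    def forward(self, x):
        c_cls, c_feat = self.coarse(x)
        f_cls, f_feat = self.fine(x)
        return self.head(torch.cat([c_cls, c_feat, f_cls, f_feat], dim=-1))


if __name__ == "__main__":
    model = MultiScaleViT(num_classes=45)             # NWPU-RESISC45 has 45 scene classes
    logits = model(torch.randn(2, 3, 224, 224))
    print(logits.shape)                               # torch.Size([2, 45])
```

The two branches differ only in patch size, so the coarse branch captures larger-scale context while the fine branch captures local detail; concatenating their class tokens and attention-enhanced patch features is one simple way to realize the fusion-based decision the abstract describes.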
Keywords: remote sensing image; scene classification; deep learning; vision transformer; multi-scale feature; channel attention
CLC number: V445 [Aerospace Science and Technology - Flight Vehicle Design]