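Affiliations: [1] School of Systems Science and Statistics, Beijing Wuzi University, Beijing 101149; [2] School of Information, Beijing Wuzi University, Beijing 101149; [3] School of Mechanical-Electronic and Vehicle Engineering, Beijing University of Civil Engineering and Architecture, Beijing 102616
Source: Spectroscopy and Spectral Analysis (《光谱学与光谱分析》), 2025, No. 5, pp. 1485-1493 (9 pages)
Funding: Supported by the National Natural Science Foundation of China (51675494); a Beijing Wuzi University institutional project (2023XJKY14); and the Institute of Systems Science, Beijing Wuzi University (BWUISS47).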
Abstract: The rich spectral information in hyperspectral remote sensing images provides reliable data support for land-cover classification. However, the high dimensionality and redundancy of spectral data, the difficulty of combining spatial and spectral features, and insufficient spectral feature extraction all pose challenges for deep-learning-based classification of hyperspectral remote sensing images. Convolutional neural networks (CNN) and Vision Transformers (ViT) are two deep learning architectures widely used in computer vision, each with its own strengths and limitations. CNNs are good at capturing local features and spatial hierarchies and handle translation invariance in images well. ViTs capture global dependencies in an image through self-attention and are better at modeling complex image patterns. To improve the classification accuracy of hyperspectral remote sensing images and exploit the strengths of both models, this paper combines the local feature extraction capability of CNN with the global context understanding capability of ViT, introduces a 3D Efficient ViT module into hybrid convolution, and proposes EVIT3D_HSN, a hyperspectral remote sensing image classification algorithm that combines hybrid convolution with a cascaded group attention mechanism. Building on 3D convolution to extract joint spatial-spectral features and 2D convolution to extract spatial features, the algorithm adds the 3D Efficient ViT module, which improves generalization across datasets and captures the image features of hyperspectral data more comprehensively, thereby improving classification performance without increasing model complexity. To validate the algorithm, EVIT3D_HSN is compared with 1DCNN, 2DCNN, 3DFCN, and 3DCNN on the Indian Pines (IP), Pavia University (PU), and Salinas hyperspectral classification datasets, and an ablation experiment is run against the original HybridSN algorithm. On these three datasets, EVIT3D_HSN achieves overall accuracy (OA) of 97.66%, 99.00%, and 99.65% and Kappa coefficients of 97.3%, 98.6%, and 99.6%, respectively. Compared with 1DCNN, classification accuracy improves by 37.12%, 25.09%, and 33.67%; compared with 2DCNN, by 59%, 57.43%, and 46.92%; compared with 3DFCN, by 45.36%, 24.5%, and 29.72%; compared with 3DCNN, by 28.05%, 14.26%, and 34.29%; and compared with HybridSN, by 3.76%, 1.85%, and 2.57%. In addition, except for Stone-Steel-Towers in the IP dataset and Painted metal sheets and Shad…
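The abstract reports results as overall accuracy (OA) and the Kappa coefficient. As a reminder of how these two metrics are conventionally defined, here is a minimal sketch in plain Python that computes both from a confusion matrix. It is not the authors' evaluation code, which this record does not show; the function names `confusion_matrix`, `overall_accuracy`, and `cohen_kappa` are illustrative, and the sketch assumes the standard Cohen's kappa definition with integer class labels starting at 0.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Build an n_classes x n_classes matrix; rows are true labels, columns are predictions."""
    m = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        m[t][p] += 1
    return m


def overall_accuracy(m):
    """OA: fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in m)
    correct = sum(m[i][i] for i in range(len(m)))
    return correct / total


def cohen_kappa(m):
    """Kappa: agreement corrected for the agreement expected by chance."""
    n = len(m)
    total = sum(sum(row) for row in m)
    p_o = overall_accuracy(m)  # observed agreement
    row_sums = [sum(m[i]) for i in range(n)]
    col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    # Chance agreement from the marginal label distributions.
    p_e = sum(row_sums[i] * col_sums[i] for i in range(n)) / (total * total)
    return (p_o - p_e) / (1 - p_e)
```

For example, true labels `[0, 0, 1, 1, 2, 2]` predicted as `[0, 0, 1, 2, 2, 2]` give OA = 5/6 and kappa = 0.75. Kappa comes out lower than OA because it discounts the agreement a classifier would reach by chance given the class frequencies.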
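Keywords: hyperspectral remote sensing image classification; hybrid convolution; 3D Efficient ViT; cascaded group attention
Classification No.: TP751.1 [Automation and Computer Technology: Detection Technology and Automatic Devices]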