Authors: 刘异 (LIU Yi)[1]; 张寅捷 (ZHANG Yinjie); 敖洋 (AO Yang); 江大龙 (JIANG Dalong); 张肇睿 (ZHANG Zhaorui) (School of Geodesy and Geomatics, Wuhan University, Wuhan 430079, China)
Affiliation: [1] School of Geodesy and Geomatics, Wuhan University, Wuhan 430079, China
Source: National Remote Sensing Bulletin (《遥感学报》), 2024, Issue 12, pp. 3173-3183 (11 pages)
Funding: National Natural Science Foundation of China (Grant No. 62071341)
Abstract: Buildings are among the most common infrastructure in cities, and extracting building regions from remote sensing imagery matters for urban planning, population estimation, and disaster assessment. Based on the Transformer architecture, this paper designs an end-to-end method for extracting building regions from remote sensing imagery. First, to address the information redundancy and inconsistency among multiscale image features, we propose a triple feature pyramid structure, Tri-FPN (Triple-Feature Pyramid Network), which performs global multiscale information fusion beyond immediately neighboring scales, improving the class-representation consistency of multiscale features while reducing redundancy. Second, to address the fact that fusing multiscale extraction results usually considers only the scale factor, we design an attention module that accounts for scale, class, and space, CSA-Module (Class-Scale Attention Module), which effectively fuses building-extraction results across scales. Finally, Tri-FPN and CSA-Module are added to the Transformer backbone for model training, yielding the best building-extraction results. Comparative experiments show that the proposed method raises the building detection rate, produces more accurate building outlines, and improves extraction accuracy, achieving IoU scores of 91.53% on the WHU Building dataset and 81.7% on the INRIA dataset.

As deep learning develops, researchers are paying increasing attention to its application in remote sensing building extraction. Many experiments on multiscale feature fusion, which boosts performance during the feature inference stage, and multiscale output fusion have been conducted to achieve a trade-off between accuracy and efficiency and to obtain enhanced details and overall effects. However, current multiscale feature fusion methods consider only the nearest feature scales, which is insufficient for cross-scale feature fusion. Multiscale output fusion is likewise limited to a unary correlation that considers only the scale element. To address these problems, we propose a feature fusion method and a result fusion module to improve the accuracy of building extraction from remote sensing images. This study proposes the Triple-Feature Pyramid Network (Tri-FPN) and the Class-Scale Attention Module (CSA-Module), built on Segformer, to extract buildings from remote sensing images. The network is divided into three components: feature extraction, feature fusion, and the classification head. In the feature extraction component, the Segformer structure is adopted to extract multiscale features. Segformer uses self-attention to extract feature maps at different scales. To adaptively enlarge the receptive field, Segformer applies a strided convolution kernel to shrink the key and value vectors in the self-attention computation, which considerably reduces the calculation cost. The goal of the feature fusion component is to fuse multiscale features from different stages of the feature extraction network. Tri-FPN consists of three feature pyramid networks; fusion follows a top-down, bottom-up, top-down sequence, enlarging the scale receptive field. The basic fusion blocks are a 3×3 convolution with element-wise feature addition and a 1×1 convolution with channel concatenation. This design helps maintain spatial diversity and intra-class feature consistency. In the classification head component, each pixel is assigned a class label (building or background).
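The spatial-reduction attention attributed to Segformer above (a strided convolution shrinking the key/value tokens before attention) can be sketched as follows. This is a minimal PyTorch re-implementation in the spirit of that design, not the authors' code; the class name, head count, and `sr_ratio` parameter are illustrative assumptions.

```python
import torch
import torch.nn as nn

class EfficientSelfAttention(nn.Module):
    """Self-attention with spatial reduction of keys/values, in the spirit
    of Segformer's efficient attention (illustrative sketch, not the paper's
    code). A strided convolution shrinks the K/V token map by sr_ratio^2,
    cutting the attention cost accordingly."""
    def __init__(self, dim, num_heads=4, sr_ratio=2):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.q = nn.Linear(dim, dim)
        self.kv = nn.Linear(dim, dim * 2)
        self.proj = nn.Linear(dim, dim)
        # Strided conv that reduces the spatial size of the K/V tokens.
        self.sr = nn.Conv2d(dim, dim, kernel_size=sr_ratio, stride=sr_ratio)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x, H, W):
        # x: (B, N, C) token sequence with N = H * W
        B, N, C = x.shape
        q = self.q(x).reshape(B, N, self.num_heads, self.head_dim).transpose(1, 2)
        # Spatially reduce the token map before computing K and V.
        x_ = x.transpose(1, 2).reshape(B, C, H, W)
        x_ = self.sr(x_).reshape(B, C, -1).transpose(1, 2)  # (B, N / r^2, C)
        x_ = self.norm(x_)
        kv = self.kv(x_).reshape(B, -1, 2, self.num_heads, self.head_dim)
        k, v = kv.permute(2, 0, 3, 1, 4)  # each (B, heads, N / r^2, head_dim)
        attn = (q @ k.transpose(-2, -1)) * self.head_dim ** -0.5
        attn = attn.softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(B, N, C)
        return self.proj(out)
```

With `sr_ratio=2` the attention matrix has N × N/4 entries instead of N × N, which is where the cost saving the abstract mentions comes from.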
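The Tri-FPN fusion sequence (top-down, bottom-up, top-down, built from 3×3 convolutions with element-wise addition and a 1×1 convolution over a channel concatenation) might be sketched as below. Channel widths, module names, and the exact placement of the 1×1 fusion are assumptions; the abstract does not specify them.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TriFPN(nn.Module):
    """Sketch of the triple feature pyramid described in the abstract:
    three fusion passes (top-down, bottom-up, top-down) over multiscale
    features. Layer names and channel sizes are assumptions, not the
    authors' implementation. Each pass mixes neighbouring scales with a
    3x3 conv after element-wise addition; a final 1x1 conv fuses the
    channel concatenation of all refined levels."""
    def __init__(self, channels, num_levels=4):
        super().__init__()
        def convs():
            return nn.ModuleList(
                nn.Conv2d(channels, channels, 3, padding=1) for _ in range(num_levels)
            )
        self.td1, self.bu, self.td2 = convs(), convs(), convs()
        self.fuse = nn.Conv2d(channels * num_levels, channels, 1)

    @staticmethod
    def _match(src, dst):
        # Resize src to dst's spatial size before element-wise addition.
        return F.interpolate(src, size=dst.shape[-2:], mode="bilinear",
                             align_corners=False)

    def _top_down(self, feats, convs):
        out = list(feats)
        for i in range(len(out) - 2, -1, -1):  # coarse -> fine
            out[i] = convs[i](out[i] + self._match(out[i + 1], out[i]))
        return out

    def _bottom_up(self, feats, convs):
        out = list(feats)
        for i in range(1, len(out)):  # fine -> coarse
            out[i] = convs[i](out[i] + self._match(out[i - 1], out[i]))
        return out

    def forward(self, feats):
        # feats: list of (B, C, H_i, W_i), index 0 = finest scale.
        feats = self._top_down(feats, self.td1)
        feats = self._bottom_up(feats, self.bu)
        feats = self._top_down(feats, self.td2)
        # Concatenate all levels at the finest resolution, then 1x1 fuse.
        up = [self._match(f, feats[0]) for f in feats]
        return self.fuse(torch.cat(up, dim=1))
```

The three passes are what lets a given level see features more than one scale away, which is the "enlarged scale-receptive field" claimed in the abstract.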
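The abstract gives only a high-level description of CSA-Module, so the following is a speculative sketch of one way to weight per-scale class score maps by a joint scale-class-spatial attention; every design detail here is an assumption rather than the paper's actual module.

```python
import torch
import torch.nn as nn

class CSAModule(nn.Module):
    """Speculative sketch of a 'class-scale attention' result fusion,
    inferred from the abstract only (not the paper's CSA-Module design).
    Per-scale class score maps are stacked along the channel axis and
    weighted by an attention map spanning the joint scale-class-spatial
    axes, then summed over scales."""
    def __init__(self, num_scales, num_classes):
        super().__init__()
        in_ch = num_scales * num_classes
        self.attn = nn.Sequential(
            nn.Conv2d(in_ch, in_ch, 3, padding=1),
            nn.Sigmoid(),  # one weight per (scale, class, pixel)
        )
        self.num_scales = num_scales
        self.num_classes = num_classes

    def forward(self, logits_per_scale):
        # logits_per_scale: list of (B, num_classes, H, W), already resized
        # to a common resolution.
        x = torch.cat(logits_per_scale, dim=1)   # (B, S*K, H, W)
        w = self.attn(x)                         # joint attention weights
        x = (x * w).reshape(x.shape[0], self.num_scales,
                            self.num_classes, *x.shape[-2:])
        return x.sum(dim=1)                      # fused scores (B, K, H, W)
```

The point of coupling scale, class, and position in one weight map, as the abstract argues, is that a plain per-scale weight (a "unary correlation" in the scale element alone) cannot prefer, say, the fine scale for building edges and the coarse scale for building interiors.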
Keywords: remote sensing imagery; building extraction; deep learning; Transformer; image feature pyramid; class-scale attention
CLC Number: P237 [Astronomy & Earth Sciences — Photogrammetry and Remote Sensing]; P2 [Astronomy & Earth Sciences — Surveying and Mapping Science and Technology]