Building extraction from high-resolution images using a hybrid attention mechanism combined with multi-scale feature enhancement

Authors: QU Haicheng[1]; LIANG Xu

Affiliation: [1] College of Software, Liaoning Technical University, Huludao 125105, China

Source: Remote Sensing for Natural Resources (《自然资源遥感》), 2024, No. 4, pp. 107-116 (10 pages)

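Funding: Jointly funded by the National Natural Science Foundation of China General Program project "Research on efficient compression methods for hyperspectral imagery oriented to preserving data characteristics" (No. 42271409) and the Liaoning Provincial Higher Education Basic Research Project "High-efficiency object detection based on spiking hybrid neural networks" (No. LJKMZ20220699).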

Abstract: Accurately extracting building information from high-resolution remote sensing images is challenging because of complex, changing backgrounds and the diversity of building shapes. This study proposes a semantic segmentation network for high-resolution building extraction, the building mining net (BMNet), which integrates a hybrid attention mechanism with multi-scale feature enhancement. First, the encoder uses VGG-16 as the backbone network to extract features, yielding four levels of feature representation. A decoder is then designed to address the loss of detail in high-level features within the multi-scale information: a series attention module (SAM), which chains channel attention and spatial attention, is introduced to strengthen the representational capacity of the high-level features. In addition, a building mining module (BMM) with progressive feature enhancement is designed to further improve segmentation accuracy. The BMM takes the upsampled feature map, the SAM-processed feature map, and the initial prediction as input, captures background noise information, and filters out that background information with the context information exploration module designed in this study; the best prediction is obtained after several passes through the BMM. Comparative experiments show that BMNet outperforms U-Net in accuracy and intersection over union (IoU) by 4.6% and 4.8%, respectively, on the WHU Building dataset, by 7.9% and 8.9% on the Massachusetts Buildings dataset, and by 6.7% and 11.0% on the Inria Aerial Image Labeling dataset. These results validate the effectiveness and practicality of the proposed model.
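The comparison with U-Net is reported in accuracy and intersection over union (IoU). As a reading aid, the following is a minimal sketch of how these two quantities are commonly computed for a binary building mask. It assumes that "accuracy" means overall pixel accuracy, which the abstract does not confirm; the function names and the convention of returning 1.0 for empty inputs are illustrative choices, not taken from the paper.

```python
from typing import Sequence, Tuple

# A 2-D binary mask: 1 = building pixel, 0 = background pixel.
Mask = Sequence[Sequence[int]]


def confusion_counts(pred: Mask, truth: Mask) -> Tuple[int, int, int, int]:
    """Count (tp, fp, fn, tn) over all pixels of two equally sized masks."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # predicted building, truly building
            elif p:
                fp += 1      # predicted building, truly background
            elif t:
                fn += 1      # missed building pixel
            else:
                tn += 1      # correctly predicted background
    return tp, fp, fn, tn


def building_iou(pred: Mask, truth: Mask) -> float:
    """IoU of the building class: tp / (tp + fp + fn)."""
    tp, fp, fn, _ = confusion_counts(pred, truth)
    union = tp + fp + fn
    # Assumed convention: two empty masks count as a perfect match.
    return tp / union if union else 1.0


def pixel_accuracy(pred: Mask, truth: Mask) -> float:
    """Share of pixels whose predicted label equals the reference label."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 1.0


if __name__ == "__main__":
    pred = [[1, 1, 0], [0, 1, 0]]
    truth = [[1, 0, 0], [0, 1, 1]]
    print(building_iou(pred, truth), pixel_accuracy(pred, truth))
```

IoU ignores true-negative background pixels, so it penalizes missed or spurious building pixels more strongly than pixel accuracy does when buildings cover only a small part of the image.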

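Keywords: semantic segmentation; high-resolution remote sensing imagery; building extraction; U-net; attention mechanism; dilated convolution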

Classification code: TP751 [Automation and Computer Technology: Detection Technology and Automatic Devices]
