Real-time semantic segmentation of UAV urban street scenes with attention permutation and channel reconstruction

Original title: 注意力置换与通道重建的无人机城市街景实时语义分割

Authors: LIU Chang-yuan (柳长源); GUO Peng-gang (郭鹏岗); LAN Chao-feng (兰朝凤) (College of Measurement and Control Technology and Communication Engineering, Harbin University of Science and Technology, Harbin 150080, China)

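Affiliation: [1] College of Measurement and Control Technology and Communication Engineering, Harbin University of Science and Technology, Harbin 150080, China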

Source: Control and Decision (《控制与决策》), 2025, No. 4, pp. 1198-1206 (9 pages)

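Funding: National Natural Science Foundation of China (11804068); Science and Technology Project of the Heilongjiang Provincial Department of Transportation (HJK2024B002).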

Abstract: Lightweight algorithms for real-time semantic segmentation of UAV urban street scenes lack global information interaction, which leads to misclassified pixels. To address this, a real-time semantic segmentation network for UAV urban street scenes based on attention permutation and channel reconstruction is proposed. The network adopts an encoder-decoder structure. In the encoder, a lightweight permutation self-attention mechanism is used to build an attention branch that extracts global context information while maintaining high computational efficiency. A channel reconstruction module, designed with a split-transform-merge strategy, fuses and compresses the input of the attention branch, reducing both the computation spent on irrelevant features and their influence on the segmentation result. In the decoder, a spatial feature fusion module is built on spatial weighting to make the fullest possible use of effective features, and a global information perception module is built from the permutation self-attention mechanism and asymmetric convolution to overcome the interference of complex backgrounds in UAV aerial images. Experimental results show that the proposed model achieves a mean intersection over union of 72.3% on the UAVid validation set, an improvement of 2.3% over UNetFormer, at a segmentation speed of 105.8 frames per second. The model thus achieves good segmentation accuracy while preserving segmentation speed.
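The accuracy figure in the abstract is a mean intersection over union (mIoU). As a reminder of what that metric measures, here is a minimal sketch in plain Python. It is not the authors' evaluation code: the function name `mean_iou` and the flat-list input format are assumptions made for illustration, and an actual UAVid evaluation may differ in details such as how classes absent from an image are treated.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection over union for two flat sequences of integer class labels.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    # confusion[t][p] counts pixels whose true class is t and predicted class is p.
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fp = sum(confusion[t][c] for t in range(num_classes)) - tp
        fn = sum(confusion[c]) - tp
        union = tp + fp + fn
        # Assumption: a class absent from both prediction and ground truth is skipped.
        if union > 0:
            ious.append(tp / union)

    return sum(ious) / len(ious) if ious else float("nan")


if __name__ == "__main__":
    prediction = [0, 0, 1, 1]
    ground_truth = [0, 1, 1, 1]
    print(mean_iou(prediction, ground_truth, num_classes=2))
```

In the small example, class 0 has an IoU of 1/2 and class 1 an IoU of 2/3, so the mean is 7/12. Per-class IoU is averaged without weighting by class frequency, so small classes count as much as large ones.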

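Keywords: real-time semantic segmentation; UAV aerial imagery; permutation self-attention; channel reconstruction; spatial feature fusion; global information perception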

Classification: TP391.4 [Automation and Computer Technology: Computer Application Technology]
