Semantic segmentation method for street images with multi-dimensional features

Authors: ZHU Lei, CHE Chenjie, YAO Tongyu, PAN Yang, ZHANG Bo

Affiliation: College of Electronic Information, Xi'an Polytechnic University, Xi'an 710600, Shaanxi, China

Source: Chinese Journal of Liquid Crystals and Displays (《液晶与显示》), 2024, No. 7, pp. 980-989 (10 pages)

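Funding: National Natural Science Foundation of China (No. 61971339); Key Research and Development Program of Shaanxi Province (No. 2019GY-113); Natural Science Basic Research Program of Shaanxi Province (No. 2019JQ-361).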

Abstract: To further improve the accuracy of deep-learning semantic segmentation on complex street-scene images, this paper proposes MDFNet, a street-scene semantic segmentation network that fuses multi-dimensional features (MDF) and is built on the PointRend network. First, a target area enhancement module is constructed to optimize the feature extraction sub-network; it adaptively refines the intermediate feature maps in each convolutional block of the deep network, strengthening the fine-grained extraction of multi-dimensional feature information from complex street-scene images. Next, a feature pyramid grid is introduced during feature fusion, using different convolutional kernels to process street-scene images at different scales and thereby capturing more comprehensively the features of the various targets at different resolutions. Finally, dual decoder heads restore image details more finely and produce the pixel-wise classification result. Experiments show that, compared with other strong segmentation networks such as DeepLabV3 and SegFormer, the proposed network achieves higher segmentation accuracy on the Cityscapes complex street-scene dataset: its mean intersection over union (mIoU) reaches 80.11%, an improvement of more than 3.51% over the other networks, indicating a stronger understanding of complex street-scene images.
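The abstract reports accuracy as mean intersection over union (mIoU). As a minimal sketch of how this metric is commonly computed, and not the authors' evaluation code, the example below builds a confusion matrix from flattened label maps and averages the per-class IoU. The function names, the toy labels, and the use of 255 as an ignored label value are assumptions made for illustration.

```python
def confusion_matrix(y_true, y_pred, num_classes, ignore_index=255):
    """Count pixels: rows are ground-truth classes, columns are predictions.

    ignore_index is an assumed "unlabeled" value that is skipped.
    """
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        if t == ignore_index:
            continue
        matrix[t][p] += 1
    return matrix


def mean_iou(matrix):
    """Average per-class IoU = TP / (TP + FP + FN), skipping absent classes."""
    num_classes = len(matrix)
    ious = []
    for c in range(num_classes):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp  # pixels of class c predicted as another class
        fp = sum(matrix[r][c] for r in range(num_classes)) - tp  # other pixels predicted as c
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example: flattened ground-truth and predicted label maps (hypothetical data).
y_true = [0, 0, 1, 1, 2, 2, 255]
y_pred = [0, 1, 1, 1, 2, 0, 2]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
miou = mean_iou(cm)
print(f"mIoU = {miou:.4f}")
```

In this toy case the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5. On a real benchmark the confusion matrix is accumulated over all validation images before the per-class IoUs are averaged.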

Keywords: semantic segmentation; target area enhancement; attention mechanism; feature pyramid grid; multi-dimensional features

Classification number: TP391.4 [Automation and Computer Technology: Computer Application Technology]
