Real-time Semantic Segmentation Based on Multi-scale Feature Map Joint Pyramid Upsampling  Cited by: 2


Authors: SONG Yu [1]; WANG Xiao-yu; LIANG Chao [1]; CHENG Chao [1] (School of Computer Science and Engineering, Changchun University of Technology, Changchun 130012, China)

Affiliation: [1] School of Computer Science and Engineering, Changchun University of Technology, Changchun 130012, Jilin, China

Source: Computer Technology and Development (《计算机技术与发展》), 2022, No. 2, pp. 82-87 (6 pages)

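Funding: Jilin Province Science and Technology Development Plan, Technology Research Project (20200401127GX); Jilin Province Science and Technology Development Plan, Key R&D Project (20200403037SF); Jilin Provincial Development and Reform Commission Project (2019C040-3).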

Abstract: Visual perception is an important part of driverless technology, and semantic segmentation is one of the main techniques for achieving it. Current semantic segmentation methods mostly rely on atrous convolution, which is computationally expensive and memory-intensive, to extract high-resolution feature maps. As a result, mainstream semantic segmentation networks are too slow to be applied effectively in driverless scenarios. To address this problem, a semantic segmentation network with better real-time performance is proposed. First, a lightweight convolutional neural network is used as the encoder, and strided convolution and regular convolution replace the time-consuming, memory-consuming atrous convolution. Second, to obtain feature maps similar to those of DeepLabv3+, a new joint upsampling module is proposed: multi-scale feature map joint pyramid upsampling (MJPU). By fusing multiple feature maps from the encoder, it generates a high-resolution feature map with richer semantic information. Experiments on the Cityscapes dataset show that, compared with the mainstream semantic segmentation network DeepLabv3+, the proposed network improves segmentation speed by 2.25 times, to 32.3 FPS, without a large loss in performance. The network therefore has better real-time performance and is better suited to driverless scenarios.
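The trade-off between atrous and strided convolution described in the abstract can be illustrated with the standard convolution output-size formula. The sketch below is not the authors' network. The five-stage layout, the 3x3 kernels, the dilation rates, and the helper names `conv_out_size` and `stage_sizes` are all assumptions chosen for illustration. It compares feature-map sizes on a 1024x2048 input (the Cityscapes image resolution) and lists the upsampling factors a joint upsampling step would need to bring the later strided stages back to a common resolution.

```python
def conv_out_size(n, kernel=3, stride=1, dilation=1, padding=None):
    """Output length along one axis for a convolution (standard formula)."""
    if padding is None:
        # "Same"-style padding for odd kernels, scaled by the dilation rate.
        padding = dilation * (kernel - 1) // 2
    return (n + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


def stage_sizes(height, width, stages):
    """Apply (stride, dilation) stages in order and record each feature-map size."""
    sizes = []
    for stride, dilation in stages:
        height = conv_out_size(height, stride=stride, dilation=dilation)
        width = conv_out_size(width, stride=stride, dilation=dilation)
        sizes.append((height, width))
    return sizes


# Hypothetical 5-stage encoders (assumed layouts, not taken from the paper).
strided = [(2, 1)] * 5                      # every stage halves the resolution
dilated = [(2, 1)] * 3 + [(1, 2), (1, 4)]   # last two stages keep resolution via dilation

strided_sizes = stage_sizes(1024, 2048, strided)
dilated_sizes = stage_sizes(1024, 2048, dilated)

# Factors needed to upsample strided stages 3 to 5 to the stage-3 resolution
# before fusing them, which is the role a joint upsampling module plays.
target_h, _ = strided_sizes[2]
factors = [target_h // h for h, _ in strided_sizes[2:]]

print("strided:", strided_sizes)
print("dilated:", dilated_sizes)
print("upsampling factors:", factors)
```

Under these assumptions the final strided feature map is 32x64 while the dilated one stays at 128x256, so the dilated encoder carries 16 times as many spatial positions through its last stages. That difference is the source of the extra computation and memory that the abstract attributes to atrous convolution.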

Keywords: driverless vehicles; semantic segmentation; convolutional neural network; deep learning; atrous convolution

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
