RGB-D Image Semantic Segmentation using Multi-scale Encoder Features Fusion


Authors: YANG Xiao-wen [1,2]; JIN Yu-xin; HAN Hui-yan; KUANG Li-qun

Affiliations: [1] School of Computer Science and Technology, North University of China, Taiyuan, Shanxi 030051, China; [2] Shanxi Province's Vision Information Processing and Intelligent Robot Engineering Research Center, Taiyuan, Shanxi 030051, China; [3] Shanxi Key Laboratory of Machine Vision and Virtual Reality, Taiyuan, Shanxi 030051, China

Source: Computer Simulation (计算机仿真), 2024, No. 9, pp. 205-212, 227 (9 pages in total)

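Funding: National Natural Science Foundation of China (62272426); Shanxi Province Research Funding Project for Returned Overseas Scholars (2020-113); Shanxi Province Special Guidance Project for the Transformation of Scientific and Technological Achievements (202104021301055).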

Abstract: To address the large variation in the size of target objects in indoor scenes in semantic segmentation, an RGB-D semantic segmentation network that fuses multi-scale encoder features is proposed on the basis of ACFNet. First, to make effective use of the multi-scale features extracted by the network, a multi-scale feature fusion module combined with pooling operations (PMFM) is proposed; it takes as input the fused RGB and depth features from different encoder stages. Second, an improved skip connection module (ISCM) is designed, in which the feature map of the next level, which contains more semantic information, is used to help correct the feature map of the current level; the result is then passed through a skip connection, by concatenation, to the corresponding decoder stage. The proposed network is evaluated on the NYUD V2 and SUN RGB-D datasets, where it reaches a mean Intersection-over-Union of 52.6% and 48.8%, respectively. The experimental results show that, with these two improvements, the method achieves high segmentation accuracy and outperforms ACFNet and the other compared semantic segmentation methods.
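The abstract reports results as mean Intersection-over-Union (mIoU). As a reminder of what that metric measures, the following is a minimal sketch of the standard definition: per-class IoU is the number of pixels where prediction and ground truth agree on the class, divided by the number of pixels where either assigns it, averaged over the classes that occur. The function name `mean_iou`, the `ignore_index` parameter, and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration; the paper's exact evaluation protocol is not stated in this record.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int],
             num_classes: int, ignore_index: int = -1) -> float:
    """Mean Intersection-over-Union over flattened per-pixel class labels.

    Illustrative sketch only: `ignore_index` and the handling of absent
    classes are assumptions, not details taken from the paper.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes  # pixels where pred == target == c
    pred_count = [0] * num_classes    # pixels predicted as class c
    target_count = [0] * num_classes  # pixels labelled as class c

    for p, t in zip(pred, target):
        if t == ignore_index:         # skip unlabelled pixels
            continue
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:                 # class c appears in pred or target
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


# Toy example with 3 classes and 6 pixels (made-up labels, not paper data).
pred_labels = [0, 0, 1, 1, 2, 2]
true_labels = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred_labels, true_labels, num_classes=3)
```

In the toy example the per-class IoUs are 1/3, 2/3 and 1/2, so `score` is 0.5. In practice the label sequences would be the flattened prediction and ground-truth maps accumulated over the whole test set.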

Keywords: semantic segmentation; multi-scale features; skip connection; deep learning

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]

 
