面向RGB-D场景解析的三维空间结构化编码深度网络被引量：1

Three-dimensional spatial structured encoding deep network for RGB-D scene parsing

机构地区：[1]哈尔滨工程大学计算机科学与技术学院,哈尔滨150001 [2]西北工业大学航空学院,西安710072

出　　处：《计算机应用》2017年第12期3458-3466,共9页journal of Computer Applications

基　　金：国家重点研发计划项目(2016YFB1000400);国家自然科学基金资助项目(60903098);中央高校自由探索基金资助项目(HEUCF100606)~~

摘　　要：有效的RGB-D图像特征提取和准确的3D空间结构化学习是提升RGB-D场景解析结果的关键。目前,全卷积神经网络(FCNN)具有强大的特征提取能力,但是,该网络无法充分地学习3D空间结构化信息。为此,提出了一种新颖的三维空间结构化编码深度网络,内嵌的结构化学习层有机地结合了图模型网络和空间结构化编码算法。该算法能够比较准确地学习和描述物体所处3D空间的物体分布。通过该深度网络,不仅能够提取包含多层形状和深度信息的分层视觉特征(HVF)和分层深度特征(HDF),而且可以生成包含3D结构化信息的空间关系特征,进而得到融合上述3类特征的混合特征,从而能够更准确地表达RGB-D图像的语义信息。实验结果表明,在NYUDv2和SUNRGBD标准RGB-D数据集上,该深度网络较现有先进的场景解析方法能够显著提升RGB-D场景解析的结果。Efficient feature extraction from RGB-D images and accurate 3D spatial structure learning are two key points for improving the performance of RGB-D scene parsing. Recently, Fully Convolutional Neural Network （FCNN） has powerful ability of feature extraction, however, FCNN can not learn 3D spatial structure information sufficiently. In order to solve the problem, a new neural network architecture called Three-dimensional Spatial Structured Encoding Deep Network （3D-SSEDN） was proposed. The graphical model network and spatial structured encoding algorithm were organically combined by the embedded structural learning layer, the 3D spatial distribution of objects could be precisely learned and described. Through the proposed 3D-SSEDN, not only the Hierarchical Visual Feature （HVF） and Hierarchical Depth Feature （HDF） eontaining hierarchical shape and depth information could be extracted, but also the spatial structure feature containing 3D structural information could be generated. Furthermore, the hybrid feature could be obtained by fusing the above three kinds of features, thus the semantic information of RGB-D images could be accurately expressed. The experimental results on the standard RGB- D datasets of NYUDv2 and SUNRGBD show that, compared with the most previous state-of-the-art scene parsing methods, the proposed 3D-SSEDN can significantly improve the performance of RGB-D scene parsing.

关键词：全卷积神经网络图模型空间结构化编码算法分层视觉特征分层深度特征空间关系特征混合特征

分类号：TP391.413[自动化与计算机技术—计算机应用技术] TP18[自动化与计算机技术—计算机科学与技术]

参考文献：

正在载入数据...

二级参考文献：

正在载入数据...

耦合文献：

正在载入数据...

引证文献：

正在载入数据...

二级引证文献：

正在载入数据...

同被引文献：

正在载入数据...

高级检索 检索式检索

时间限定

期刊范围

学科限定全选

高级检索 检索式检索

时间限定

期刊范围

学科限定全选

面向RGB-D场景解析的三维空间结构化编码深度网络被引量：1

我的收藏

参考文献：

二级参考文献：

耦合文献：

引证文献：

二级引证文献：

同被引文献：

相关期刊文献：

相关的主题

相关的作者对象

相关的机构对象

下载全文

高级检索检索式检索

时间限定

期刊范围

学科限定全选

高级检索 检索式检索

时间限定

期刊范围

学科限定全选

面向RGB-D场景解析的三维空间结构化编码深度网络 被引量：1

我的收藏

参考文献：

二级参考文献：

耦合文献：

引证文献：

二级引证文献：

同被引文献：

相关期刊文献：

相关的主题

相关的作者对象

相关的机构对象

下载全文

用户登录

高级检索检索式检索

面向RGB-D场景解析的三维空间结构化编码深度网络被引量：1