Recognition of 3D Object Based on Multi-View Recurrent Neural Networks  Cited by: 3

Authors: DONG Shuai; LI Wen-sheng[1]; ZHANG Wen-qiang; ZOU Kun[1]

Affiliation: [1] Zhongshan Institute, University of Electronic Science and Technology of China, Zhongshan, Guangdong 528406

Source: Journal of University of Electronic Science and Technology of China, 2020, No. 2, pp. 269-275 (7 pages)

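Funding: National Science Foundation for Young Scientists of China (61502088); Natural Science Foundation of Guangdong Province (2016A030313018); Training Program for Outstanding Young Teachers in Higher Education Institutions of Guangdong Province (Yq2013206).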

Abstract: For 3D object recognition, the multi-view convolutional neural network (MVCNN) approach outperforms methods based on 3D data representations in both accuracy and training speed. However, MVCNN depends on 3D models and takes views rendered from cameras at fixed positions as input, which does not match most practical applications. In addition, MVCNN fuses view features with a max-pooling operation, which loses part of the original feature information. To address these two problems, this paper proposes a 3D object recognition method based on multi-view recurrent neural networks (MVRNN) that improves on MVCNN in three respects. First, a feature discrimination measure is introduced into the cross-entropy loss function to make features of different objects more distinguishable. Second, a recurrent neural network (RNN) replaces the max-pooling operation of MVCNN to fuse multiple free-viewpoint view features into a single feature that is more compact and retains complete information about the object's appearance. Third, a binary classification network matches free-viewpoint single-view features against the fused features to achieve fine-grained recognition of 3D objects. To validate MVRNN, comparative experiments are conducted on the public dataset ModelNet and the self-built dataset MV3D. The results show that, compared with MVCNN, MVRNN extracts multi-view features with a higher degree of discrimination and achieves clearly higher recognition accuracy on both datasets.
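The difference between max-pooling fusion and recurrent fusion described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which recurrent cell MVRNN uses, so a plain tanh (Elman) cell is assumed here, and the function names `max_pool_fuse` and `rnn_fuse`, the toy weights `W_X`, `W_H`, `B`, and the two-dimensional view features are all hypothetical.

```python
import math


def max_pool_fuse(views):
    """Element-wise maximum over view feature vectors (MVCNN-style view pooling)."""
    return [max(column) for column in zip(*views)]


def rnn_fuse(views, w_x, w_h, b):
    """Fuse views with a plain tanh recurrent cell (an assumed cell type).

    Each view updates the hidden state; the final hidden state is
    returned as the fused feature.
    """
    h = [0.0] * len(b)
    for x in views:
        # The comprehension reads the previous h before h is reassigned.
        h = [
            math.tanh(
                sum(w_x[i][j] * x[j] for j in range(len(x)))
                + sum(w_h[i][k] * h[k] for k in range(len(h)))
                + b[i]
            )
            for i in range(len(b))
        ]
    return h


# Hypothetical, untrained toy weights chosen only for illustration.
W_X = [[0.5, -0.3], [0.2, 0.4]]
W_H = [[0.1, 0.0], [0.0, 0.1]]
B = [0.0, 0.0]

# Two different sets of view features that share the same element-wise maximum.
views_a = [[1.0, 5.0], [3.0, 2.0]]
views_b = [[3.0, 5.0], [0.0, 0.0]]

fused_max_a = max_pool_fuse(views_a)
fused_max_b = max_pool_fuse(views_b)
fused_rnn_a = rnn_fuse(views_a, W_X, W_H, B)
fused_rnn_b = rnn_fuse(views_b, W_X, W_H, B)

print(fused_max_a == fused_max_b)  # max-pooling cannot tell the two sets apart
print(fused_rnn_a != fused_rnn_b)  # the recurrent fusion produces different features
```

In this toy case max-pooling maps both view sets to the same fused vector because only the per-dimension maxima survive, while the recurrent cell sees every value of every view. A trained model would learn the weights rather than fix them by hand.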

Keywords: 3D object; feature extraction; feature fusion; image retrieval; multi-view

Classification code: TP391.4 [Automation and Computer Technology: Computer Application Technology]

 
