Research on depth prediction algorithm based on multi-task model (基于多任务模型的深度预测算法研究)

Cited by: 1

Authors: YAO Han (姚翰); YIN Xue-feng (殷雪峰); LI Tong (李童); ZHANG Zhao-xuan (张肇轩); YANG Xin (杨鑫) [1]; YIN Bao-cai (尹宝才)

Affiliation: [1] School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China

Source: Journal of Graphics (《图学学报》), 2021, No. 3, pp. 446-453 (8 pages)

Funding: National Natural Science Foundation of China (91748104, 61972067, 61632006, U1811463, U1908214, 61751203); National Key Research and Development Program of China (2018AAA0102003).

Abstract: Image depth prediction is an active research topic in computer vision and robotics. Constructing a depth map is an important prerequisite for 3D reconstruction. Traditional methods either rely on manual annotation from the depth of fixed points, or predict depth by binocular positioning based on changes in camera position. Such methods are time-consuming and labor-intensive, and they are also constrained by factors such as camera position, positioning method, and distribution probability. Their accuracy is therefore hard to guarantee, which makes the predicted depth maps difficult to use in downstream work such as 3D reconstruction. Introducing a deep learning method based on multi-task modules can effectively address this problem. For scene images, a monocular-image depth prediction network based on a multi-task model is proposed, which jointly trains three tasks: depth prediction, semantic segmentation, and surface vector (normal) estimation. The network consists of a common feature extraction module and a multi-task feature fusion module. It extracts features shared across tasks while keeping each task's features independent, improving the structural quality of each task's output while maintaining the accuracy of depth prediction.
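The abstract does not state how the three tasks are combined during training. As an illustration only, the following minimal sketch assumes a weighted sum of one loss per task; the function names, the choice of L1, cross-entropy, and cosine losses, and the equal default weights are all hypothetical and are not taken from the paper.

```python
import math

# Hypothetical per-task losses; the paper's actual loss terms are not given
# in the abstract, so these are common stand-ins chosen for illustration.

def depth_l1_loss(pred, target):
    """Mean absolute error between predicted and ground-truth depth values."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

def segmentation_ce_loss(probs, labels):
    """Mean cross-entropy; probs[i] holds per-class probabilities for pixel i."""
    eps = 1e-12  # avoids log(0)
    return -sum(math.log(p[y] + eps) for p, y in zip(probs, labels)) / len(labels)

def normal_cosine_loss(pred, target):
    """Mean (1 - cosine similarity) between predicted and true surface vectors."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return dot / max(norm_a * norm_b, 1e-12)  # guard against zero vectors
    return sum(1.0 - cosine(a, b) for a, b in zip(pred, target)) / len(pred)

def multi_task_loss(depth, seg, normal, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three task losses (weights are an assumption)."""
    w_depth, w_seg, w_normal = weights
    return w_depth * depth + w_seg * seg + w_normal * normal

# Toy example with two "pixels" for depth and segmentation, one for normals.
d = depth_l1_loss([1.0, 2.0], [1.5, 2.0])
s = segmentation_ce_loss([[0.5, 0.5], [1.0, 0.0]], [0, 0])
n = normal_cosine_loss([(0.0, 0.0, 1.0)], [(0.0, 0.0, 1.0)])
total = multi_task_loss(d, s, n)
```

In a real network each loss would be computed on the output of a task-specific head fed by the shared feature extractor, and the weights would typically be tuned or learned so that no single task dominates training.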

Keywords: computer vision; monocular depth prediction; multi-task model; semantic segmentation; surface vector estimation

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
