面向室内场景的主被动融合视觉定位系统  Cited by: 1

Visual localization system of integrated active and passive perception for indoor scenes


Authors: Xie Ting (谢挺); Zhang Xiaojie (张晓杰); Ye Zhichao (叶智超); Wang Zihao (王子豪); Wang Zheng (王政); Zhang Yong (张涌); Zhou Xiaowei (周晓巍); Ji Xiaopeng (姬晓鹏)

Affiliations: [1] State Key Laboratory of CAD&CG, Zhejiang University, Hangzhou 310058, China; [2] System Engineering Research Institute, Beijing 100094, China; [3] China Information Technology Designing & Consulting Institute, Beijing 100048, China

Source: Journal of Image and Graphics (《中国图象图形学报》), 2023, No. 2, pp. 522-534 (13 pages)

Funding: Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2020AAA0108901).

Abstract: Objective. Visual localization aims to localize a moving object and estimate its pose from easily acquired RGB images. Interference that is common in indoor scenes, such as object occlusion and weakly textured regions, readily causes erroneous estimates of the target's keypoints and severely degrades localization accuracy. To address this problem, this paper proposes an indoor localization system that fuses active and passive localization, combining the advantages of fixed-viewpoint and moving-viewpoint schemes to localize moving targets in indoor scenes accurately. Method. An object pose estimation method based on a planar prior is proposed: building on a keypoint-detection-based monocular localization framework, it uses a planar constraint to optimize a 3-degree-of-freedom pose, which improves the localization stability of targets moving on an indoor plane under a fixed viewpoint. A data-fusion localization system is designed based on the unscented Kalman filter. It fuses the passive localization result obtained from the fixed viewpoint with the active localization result obtained from the moving viewpoint, which improves the reliability of the pose estimate for the moving target. Result. On the iGibson simulation dataset, the proposed system achieves an average localization accuracy of 2-3 cm, and 99% of estimates have an error within 10 cm. In real scenes, the average localization accuracy is 3-4 cm, and more than 90% of estimates have an error within 10 cm, so the system reaches centimeter-level accuracy. Conclusion. The proposed indoor visual localization system combines the advantages of passive and active localization methods. It achieves high-accuracy target localization in indoor scenes at low equipment cost and remains robust under complex environmental interference such as occlusion and target loss.

Objective Visual localization aims to locate moving objects and estimate their pose from easily acquired RGB images. With traditional computer vision methods, the information obtained by hand-designed feature extraction algorithms struggles to meet the requirements of this task. The feature abstraction and representation ability of deep learning has made pose estimation an active research topic in computer vision. In addition, the development and application of depth cameras and laser-based sensors offer more diverse approaches to the problem. However, these sensors impose constraints on the object and its shape, and they need to be used in a structured environment. Multi-camera systems are often difficult to install and calibrate. In contrast, visual sensors are low-cost and less restrictive, and they are easy to deploy and extend to a variety of unstructured scenarios. Indoor scenes contain interference such as object occlusion and weakly textured areas, which easily causes incorrect estimates of the target keypoints and severely degrades the accuracy of visual localization. Visual object pose estimation methods can be divided into two categories according to how the camera is deployed. 1) The first category is monocular object pose estimation, in which cameras are deployed at fixed positions in the scene and the target is located by detecting its relevant information in the images. Its results are stable, but it is easily affected by lighting and image blur and, because the observation angle is limited, it cannot handle object occlusion in the scene. 2) The second category is object pose estimation based on scene reconstruction, in which a camera fixed on the target itself obtains the target's pose by detecting feature points of the scene and matching the features w

Keywords: visual localization; data fusion; keypoint detection; 3D reconstruction; PnP algorithm; Kalman filter

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]
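
The abstract reports two figures: an average localization error, and the share of estimates whose error is within 10 cm. The sketch below shows one plausible way to compute such figures from a trajectory. It is an illustration only: the function name `localization_metrics`, the planar (x, y) representation in metres, the inclusive `<=` threshold, and the toy values are all assumptions, not the authors' evaluation code or data.

```python
import math


def localization_metrics(estimated, ground_truth, threshold_m=0.10):
    """Return (mean error in metres, fraction of frames with error <= threshold_m).

    Both inputs are equal-length sequences of (x, y) positions in metres.
    This is a hypothetical helper; the paper's exact protocol is not given
    in the abstract.
    """
    if len(estimated) != len(ground_truth) or not estimated:
        raise ValueError("need two non-empty trajectories of equal length")
    # Per-frame Euclidean distance between estimate and ground truth.
    errors = [math.dist(e, g) for e, g in zip(estimated, ground_truth)]
    mean_error = sum(errors) / len(errors)
    within = sum(err <= threshold_m for err in errors) / len(errors)
    return mean_error, within


# Toy planar positions in metres (made-up values, not data from the paper).
ground_truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
estimated = [(0.03, 0.04), (1.0, 0.02), (2.0, 0.0), (3.0, 0.5)]

mean_error, within = localization_metrics(estimated, ground_truth)
print(f"mean error: {mean_error * 100:.2f} cm, within 10 cm: {within:.0%}")
```

The toy trajectory includes one gross outlier (0.5 m) to show why the two figures are reported together: a single large miss raises the mean error noticeably while lowering the within-threshold rate by only one frame's share.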

 
