Authors: Zhihuang ZHANG, Meng XU, Wenqiang ZHOU, Tao PENG, Liang LI, Stefan POSLAD
Affiliations: [1] School of Information Technology & Management, University of International Business and Economics, Beijing 100029, China; [2] School of Vehicle and Mobility, Tsinghua University, Beijing 100084, China; [3] Qcraft Inc., Beijing 100054, China; [4] School of Electronic Engineering and Computer Science, Queen Mary University of London, London E1 4NS, UK
Source: Science China (Information Sciences), 2025, No. 2, pp. 130-146 (17 pages)
Funding: Supported by the Beijing Higher Education Society under the 2024 General Project Scheme (Grant No. MS2024128), and by the Ningbo Philosophy and Social Science Planning Project as part of the "Ningbo Development Blue Book 2025" Initiative (Grant No. GL24-16).
Abstract: Accurate localization is fundamental to autonomous driving. Traditional visual localization frameworks approach the semantic map-matching problem with geometric models, which rely on complex parameter tuning and thus hinder large-scale deployment. In this paper, we propose BEV-Locator: an end-to-end visual semantic localization neural network using multi-view camera images. Specifically, a visual BEV (bird's-eye-view) encoder extracts and flattens the multi-view images into BEV space, while the semantic map features are structurally embedded as map query sequences. A cross-modal transformer then associates the BEV features with the semantic map queries, and the localization information of the ego vehicle is recursively queried out by cross-attention modules. Finally, the ego pose is inferred by decoding the transformer outputs. This end-to-end design makes the model broadly applicable across driving environments, including high-speed scenarios. We evaluate the proposed method on the large-scale nuScenes and Qcraft datasets. The experimental results show that BEV-Locator estimates vehicle poses under versatile scenarios and effectively associates cross-modal information from multi-view images and global semantic maps. The experiments report satisfactory accuracy, with mean absolute errors of 0.052 m, 0.135 m, and 0.251° in lateral translation, longitudinal translation, and heading angle, respectively.
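The cross-attention step described in the abstract, in which embedded map queries attend over flattened BEV features before a pose is decoded, can be sketched as follows. This is a minimal single-head illustration in NumPy under assumed shapes and without learned projections; the variable names (`bev_feats`, `map_queries`, `w_pose`) are hypothetical and not taken from the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 32            # embedding dimension (assumed)
n_bev = 50 * 50   # number of flattened BEV grid cells (assumed)
n_q = 8           # length of the map query sequence (assumed)

bev_feats = rng.standard_normal((n_bev, d))   # stand-in for BEV encoder output
map_queries = rng.standard_normal((n_q, d))   # stand-in for embedded map elements

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values):
    """Each query attends over all key/value vectors (single head, no projections)."""
    scores = queries @ keys_values.T / np.sqrt(d)   # (n_q, n_bev)
    weights = softmax(scores, axis=-1)              # rows sum to 1
    return weights @ keys_values                    # (n_q, d)

attended = cross_attention(map_queries, bev_feats)

# Hypothetical pose head: pool the attended queries and project to a 3-DoF
# pose (lateral offset, longitudinal offset, heading correction).
w_pose = rng.standard_normal((d, 3)) * 0.01
pose = attended.mean(axis=0) @ w_pose
print(pose.shape)  # (3,)
```

In the paper's design the queries carry semantic map structure and the transformer is applied recursively, so this sketch shows only the core attention mechanics, not the full architecture.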
Keywords: visual localization; semantic map; bird's-eye-view; transformer; pose estimation
Classification: TP391.41 [Automation and Computer Technology: Computer Application Technology]; U463.6 [Automation and Computer Technology: Computer Science and Technology]