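Affiliations: [1] AVIC (Chengdu) UAS Co., Ltd., Chengdu 611730 [2] School of Resources and Environment, University of Electronic Science and Technology of China, Chengdu 611731 [3] School of Information and Engineering, Sichuan Tourism University, Chengdu 610100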
Source: Journal of Spatio-temporal Information (《时空信息学报》), 2025, No. 1, pp. 62-72 (11 pages)
Funding: Sichuan Science and Technology Program (2023YFG0028, 2024YFG0002); Sichuan Province Transfer Payment Applied R&D Project (R22ZYZF0004).
Abstract: In cross-view image-to-image retrieval for UAV image geo-localization, the use of Transformers is still at an early stage, and highly robust methods that incorporate multi-scale information are lacking. This paper proposes Transformer oblong matching for geo-localization (TomGeo), a method for UAV image geo-localization and for cross-view image-to-image retrieval in image-region-based navigation. First, a pyramid vision Transformer (PVT) serves as the feature encoder to extract multi-scale image features. Second, block classification and block matching are performed on the patch features to establish correspondence between the same instance regions in images of one location taken from different views. Finally, salient region identification enhances the key instance category information in the cross-view images. Accuracy is evaluated against existing methods on the public dataset University-1652. The results show that: ① when retrieving the satellite-view image of the location corresponding to a UAV-view image, TomGeo reaches a recall@1 (R@1) of 85.54% and an average precision (AP) of 87.62%; when retrieving the UAV-view images of the location corresponding to a satellite-view image, R@1 reaches 91.43% and AP reaches 85.87%. ② TomGeo outperforms existing methods on every evaluation metric. The results can provide technical support for UAV operation under special conditions and for the development of the low-altitude economy.
[Objective] With the diversification and cost-effective development of remote sensing platforms, the availability of remote sensing data has increased significantly. The current challenge lies in effectively managing this multi-source, massive, and variable data. Unmanned aerial vehicles (UAVs), as one of the most convenient platforms for remote sensing data acquisition, have developed rapidly and gained widespread use in recent years. The data collected by UAVs has proven invaluable for both civil and military applications. Geo-localization of UAV imagery is a key step in many applications. However, geo-localization becomes a significant challenge when the global navigation satellite system (GNSS) is unavailable or performs poorly due to external influences.
[Method] Based on Transformer block matching, we propose a cross-view image-to-image retrieval method named TomGeo (Transformer oblong matching for geo-localization). The method can be used for geo-localization of UAV images and for image-based navigation. TomGeo uses a pyramid vision Transformer (PVT) as the feature encoder to extract multi-scale features from UAV and satellite images. Block classification and block matching are then performed on these features to establish correspondence between the same regions in images taken from different viewpoints at the same location. Finally, salient region identification is used to enhance key instance category information in the cross-view images.
[Result] TomGeo implements multi-scale feature fusion based on PVT and, through block classification, block matching, and salient region identification, mitigates the low utilization of positional differences and of the contextual information of key features in cross-view image retrieval. On the publicly available dataset University-1652, when retrieving satellite-view images of the corresponding location from UAV-view images, TomGeo's R@1 is 85.54% and its AP is 87.62%; when retrieving UAV-view images of the corresponding location from satellite-view images, R@1 is
Keywords: cross-view; UAV; geo-localization; image-to-image retrieval; Transformer; block matching; multi-scale features
Classification number: P23 [Astronomy and Earth Sciences: Photogrammetry and Remote Sensing]
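The R@1 and AP figures quoted in the abstract are standard retrieval metrics. The sketch below illustrates how such metrics are commonly computed once each image has been reduced to a feature vector. It is not the authors' code: the helper names (`rank_gallery`, `evaluate`), the cosine-similarity ranking, the toy two-dimensional features, and the non-interpolated AP definition are all assumptions made for illustration, and the evaluation script used with University-1652 may differ in detail.

```python
import math


def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_gallery(query, gallery):
    """Return gallery indices ordered from most to least similar to the query."""
    sims = [cosine(query, g) for g in gallery]
    return sorted(range(len(gallery)), key=lambda i: -sims[i])


def recall_at_1(ranking, gallery_labels, query_label):
    """1.0 if the top-ranked gallery item shows the query's location, else 0.0."""
    return 1.0 if gallery_labels[ranking[0]] == query_label else 0.0


def average_precision(ranking, gallery_labels, query_label):
    """Non-interpolated AP: mean of precision values at each correct match."""
    hits, total = 0, 0.0
    for rank, idx in enumerate(ranking, start=1):
        if gallery_labels[idx] == query_label:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def evaluate(queries, query_labels, gallery, gallery_labels):
    """Average R@1 and AP over all queries."""
    r1_sum, ap_sum = 0.0, 0.0
    for q, label in zip(queries, query_labels):
        ranking = rank_gallery(q, gallery)
        r1_sum += recall_at_1(ranking, gallery_labels, label)
        ap_sum += average_precision(ranking, gallery_labels, label)
    n = len(queries)
    return r1_sum / n, ap_sum / n


# Hypothetical toy data: four gallery images covering two locations, "A" and "B".
gallery = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]
gallery_labels = ["A", "B", "A", "B"]
# Two queries of location "A": one easy, one whose feature resembles location "B".
queries = [[1.0, 0.0], [0.0, 1.0]]
query_labels = ["A", "A"]

mean_r1, mean_ap = evaluate(queries, query_labels, gallery, gallery_labels)
print(f"R@1 = {mean_r1:.4f}, AP = {mean_ap:.4f}")
```

In the UAV-to-satellite direction each query typically has a single correct gallery image, while in the satellite-to-UAV direction a query has many correct matches, which is one reason R@1 and AP can move differently between the two directions.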