Triplet Interaction Mechanism in Cross-view Geo-localization


Authors: ZHOU Bowen; LI Yang; WANG Jiabao; MIAO Zhuang; ZHANG Rui (College of Command and Control Engineering, Army Engineering University of PLA, Nanjing 210007, China)

Affiliation: [1] College of Command and Control Engineering, Army Engineering University of PLA, Nanjing 210007, China

Source: Computer Science (《计算机科学》), 2025, No. 3, pp. 86-94 (9 pages)

Funding: Natural Science Foundation of Jiangsu Province (BK20200581).

Abstract: Cross-view geo-localization is an image retrieval task: an image without geographic coordinates, taken from one viewpoint, is matched against a database of geo-tagged images taken from another viewpoint in order to obtain the location of the target image. However, most existing methods neglect global position information and feature completeness, which prevents the model from capturing deep semantic information. In addition, existing two-dimensional interaction methods do not fully exploit the relationships between dimensions, so cross-dimensional interaction is insufficient. To address these problems, this paper designs a triplet interaction mechanism for cross-view geo-localization. The method uses ConvNeXt as the feature extraction network, then applies the proposed Triplet Interaction Mechanism (TIM) to enrich the features, and finally uses a joint loss function to guide model training. It performs triplet interaction multiple times within the model, which alleviates the partial information loss caused by two-dimensional feature projection. The mechanism also uses a different attention in each of its three channels, making the model robust to translation, scaling, and rotation of cross-view images. Experimental results show that the proposed method achieves the best performance on the University-1652 dataset for both the drone-view localization task and the drone navigation task.
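The retrieval formulation in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes that feature vectors have already been extracted by some network, and it uses cosine similarity for ranking and Recall@K for evaluation, neither of which the abstract specifies. All function names and the toy data in the usage note are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a feature vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors of equal length."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def rank_gallery(query_feat, gallery_feats):
    """Return gallery indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query_feat, g) for g in gallery_feats]
    return sorted(range(len(gallery_feats)), key=lambda i: scores[i], reverse=True)


def localize(query_feat, gallery_feats, gallery_coords):
    """Assign the query the coordinates of its best-matching geo-tagged image."""
    best = rank_gallery(query_feat, gallery_feats)[0]
    return gallery_coords[best]


def recall_at_k(query_feats, query_labels, gallery_feats, gallery_labels, k=1):
    """Fraction of queries whose true location label appears in the top-k results."""
    hits = 0
    for feat, label in zip(query_feats, query_labels):
        top_k = rank_gallery(feat, gallery_feats)[:k]
        if any(gallery_labels[i] == label for i in top_k):
            hits += 1
    return hits / len(query_feats)
```

As a usage example with made-up two-dimensional features, `localize([0.9, 0.1], [[1.0, 0.0], [0.0, 1.0]], [(32.0, 118.8), (31.2, 121.5)])` picks the first gallery entry, because the query points in nearly the same direction as `[1.0, 0.0]`. In the two tasks named in the abstract, the roles of query and gallery swap between the drone view and the other view, while the ranking step stays the same.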

Keywords: cross-view; geo-localization; interaction mechanism; feature attention

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]

 
