Semantic Topological Maps-Based Reasoning for Vision-and-Language Navigation in Continuous Environments


Authors: XIE Zilong; XU Ming (Software College, Liaoning Technical University, Huludao 125105)

Affiliation: [1] Software College, Liaoning Technical University, Huludao 125105

Source: Pattern Recognition and Artificial Intelligence, 2024, No. 9, pp. 839-849 (11 pages)

Funding: Supported by the Doctoral Research Fund of Liaoning Technical University (No. 21-1027).

Abstract: To address the inadequate reasoning ability of existing vision-and-language navigation methods in continuous environments, a reasoning model for vision-and-language navigation based on semantic topological maps is proposed. First, regions and objects in the navigation environment are identified through scene-understanding auxiliary tasks, and a spatial proximity knowledge base is constructed. Second, the agent interacts with the environment in real time during navigation, collecting location information, encoding visual features, and predicting semantic labels of regions and objects, so that a semantic topological map is generated incrementally. On this basis, an auxiliary reasoning localization strategy is designed: a self-attention mechanism extracts object and region information from the navigation instruction, and the spatial proximity knowledge base is combined with the semantic topological map to infer and localize objects and regions. This assists navigation decisions and keeps the agent's trajectory aligned with the instruction. Experiments on the public datasets R2R-CE and RxR-CE show that the proposed model achieves a relatively high navigation success rate.
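Illustrative sketch: the abstract describes combining a spatial proximity knowledge base with a semantic topological map to localize objects and regions named in the instruction. The Python below is a toy illustration of that idea only, not the authors' model. Every name in it (Node, ProximityKB, SemanticTopoMap, localize) is hypothetical, the knowledge base is reduced to simple object/region co-occurrence counts, and the self-attention extraction step is assumed to have already produced the target object and region as plain strings.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One waypoint in the map (hypothetical structure, not from the paper)."""
    position: tuple                             # location collected while navigating
    region: str                                 # predicted region label
    objects: set = field(default_factory=set)   # predicted object labels


class ProximityKB:
    """Toy spatial proximity knowledge base: object/region co-occurrence counts."""

    def __init__(self):
        self.counts = {}  # object label -> {region label -> count}

    def observe(self, region, objects):
        for obj in objects:
            by_region = self.counts.setdefault(obj, {})
            by_region[region] = by_region.get(region, 0) + 1

    def prior(self, obj, region):
        """Fraction of sightings of `obj` that fell in `region` (0.0 if unseen)."""
        by_region = self.counts.get(obj, {})
        total = sum(by_region.values())
        return by_region.get(region, 0) / total if total else 0.0


class SemanticTopoMap:
    """Graph of nodes that grows as the agent explores."""

    def __init__(self):
        self.nodes = {}   # node id -> Node
        self.edges = {}   # node id -> set of neighbouring node ids

    def add_node(self, node_id, node, neighbor=None):
        self.nodes[node_id] = node
        self.edges.setdefault(node_id, set())
        if neighbor is not None:
            self.edges[node_id].add(neighbor)
            self.edges.setdefault(neighbor, set()).add(node_id)


def localize(topo_map, kb, target_object, target_region=None):
    """Score each node for a target; return (best node id, all scores).

    Assumed scoring rule: 1.0 if the object was observed at the node,
    otherwise the knowledge-base prior for the node's region, plus 1.0
    if the node's region matches the region named in the instruction.
    """
    scores = {}
    for node_id, node in topo_map.nodes.items():
        if target_object in node.objects:
            score = 1.0
        else:
            score = kb.prior(target_object, node.region)
        if target_region is not None and node.region == target_region:
            score += 1.0
        scores[node_id] = score
    if not scores:
        return None, scores
    return max(scores, key=scores.get), scores
```

In this sketch, an object that has not yet been seen can still be assigned to the node whose region most often contains it, which is one simple way a proximity prior could guide the next navigation decision.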

Keywords: vision-and-language navigation; visual reasoning; multimodal data; embodied intelligence

Classification number: TP391.41 [Automation and Computer Technology: Computer Application Technology]

 
