Ant colony pheromone aided Q-learning path planning algorithm

Original title: 基于蚁群信息素辅助的Q学习路径规划算法

Cited by: 8

Authors: TIAN Xiao-hang; HUO Xin [1]; ZHOU Dian-le; ZHAO Hui [1]

Affiliations: [1] Control and Simulation Center, Harbin Institute of Technology, Harbin 150080, China; [2] College of Advanced Interdisciplinary Studies, National University of Defense Technology, Changsha 410073, China

Source: Control and Decision (《控制与决策》), 2023, No. 12, pp. 3345-3353 (9 pages)

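Funding: Natural Science Foundation of Heilongjiang Province (LH2021F025); Fundamental Research Funds for the Central Universities (HIT.NSRIF202242); Heilongjiang Province Education Reform Project (SJGY20200185); Harbin Institute of Technology Graduate Education Reform Core Project (21HX0401).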

Abstract: When Q-learning is applied to path planning, the randomness of action selection and the limited magnitude of Q-table updates cause the agent to explore suboptimal states and paths repeatedly, which slows convergence. To address this problem, the pheromone mechanism of the ant colony algorithm is introduced and a method for optimizing the search range is proposed, reducing the number of ineffective explorations by the agent. In addition, to make the early iterations more goal-directed, a Q-table initialization method is designed based on the positional relationship between the current grid cell and the goal and on the characteristics of the agent's action selection. To give the algorithm a suitable exploration probability in the early, middle, and late stages of a run, a method for dynamically adjusting the exploration factor according to pheromone concentration is designed. Finally, simulation experiments in multiple environments of different sizes and characteristics verify the effectiveness and feasibility of the proposed algorithm.
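The abstract names the components but gives no formulas, so the following is only a minimal sketch of how two of the ideas might fit together on a grid map: a Q-table initialized from the distance to the goal, and an exploration factor that shrinks where pheromone has accumulated. Every function name, reward value, and functional form here (`init_q`, `epsilon`, `train`, the linear epsilon schedule, the `deposit / len(path)` rule) is an assumption made for illustration and is not taken from the paper. The paper's search-range optimization is not reproduced.

```python
import math
import random

ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def init_q(rows, cols, goal, scale=0.1):
    """Assumed initialization: actions that reduce distance to the goal start higher."""
    q = {}
    for r in range(rows):
        for c in range(cols):
            d_here = math.hypot(goal[0] - r, goal[1] - c)
            for a, (dr, dc) in enumerate(ACTIONS):
                d_next = math.hypot(goal[0] - (r + dr), goal[1] - (c + dc))
                q[(r, c), a] = scale * (d_here - d_next)
    return q


def epsilon(tau_here, tau_max, eps_min=0.05, eps_max=0.9):
    """Assumed schedule: explore less in cells that carry more pheromone."""
    ratio = tau_here / tau_max if tau_max > 0 else 0.0
    return eps_max - (eps_max - eps_min) * ratio


def train(grid, start, goal, episodes=300, alpha=0.5, gamma=0.9,
          rho=0.1, deposit=1.0, max_steps=200, seed=0):
    """Tabular Q-learning with a per-cell pheromone map (grid: 0 free, 1 obstacle)."""
    rng = random.Random(seed)
    rows, cols = len(grid), len(grid[0])
    q = init_q(rows, cols, goal)
    tau = {(r, c): 0.0 for r in range(rows) for c in range(cols)}
    n_actions = len(ACTIONS)
    for _ in range(episodes):
        state, path = start, [start]
        tau_max = max(tau.values())  # pheromone only changes between episodes
        for _ in range(max_steps):
            if rng.random() < epsilon(tau[state], tau_max):
                a = rng.randrange(n_actions)  # explore
            else:
                a = max(range(n_actions), key=lambda k: q[state, k])  # exploit
            nr, nc = state[0] + ACTIONS[a][0], state[1] + ACTIONS[a][1]
            blocked = not (0 <= nr < rows and 0 <= nc < cols) or grid[nr][nc] == 1
            nxt = state if blocked else (nr, nc)
            # Reward values are illustrative assumptions.
            reward = -1.0 if blocked else (10.0 if nxt == goal else -0.1)
            best_next = 0.0 if nxt == goal else max(q[nxt, k] for k in range(n_actions))
            # Standard Q-learning update.
            q[state, a] += alpha * (reward + gamma * best_next - q[state, a])
            state = nxt
            path.append(state)
            if state == goal:
                break
        # Evaporate everywhere, then deposit on successful paths (shorter paths deposit more).
        for cell in tau:
            tau[cell] *= 1.0 - rho
        if state == goal:
            for cell in set(path):
                tau[cell] += deposit / len(path)
    return q, tau
```

As a usage example, `train([[0, 0, 0], [0, 1, 0], [0, 0, 0]], (0, 0), (2, 2))` returns the learned Q-table and the pheromone map for a 3×3 grid with one obstacle in the center. Tying epsilon to local pheromone is one plausible reading of "dynamically adjusting the exploration factor in combination with pheromone concentration"; the published method may differ.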

Keywords: Q-learning; path planning; Q-table initialization; exploration probability; ant colony algorithm; pheromone

Classification: TP273 [Automation and Computer Technology: Detection Technology and Automation Devices]
