Improved Q-Learning Dynamic Obstacle Avoidance Algorithm for Logistics Robots

Authors: WANG Li (王力); ZHAO Quanhai (赵全海); HUANG Shilei (黄石磊)

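Affiliations: [1] Luohe Cigarette Factory, Henan Tobacco Industry Co., Ltd., Luohe 462005, Henan, China; [2] Information Engineering University, Strategic Support Force of the Chinese People's Liberation Army, Zhengzhou 450000, China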

Source: Computer Measurement & Control (《计算机测量与控制》), 2025, No. 3, pp. 267-274 (8 pages)

Funding: AECC Industry-University-Research Cooperation Project (HFZL2021CXY007); Basic Science Center Project for Aero Engines and Gas Turbines (P2022-B-V-002-001).

Abstract: To improve the autonomous navigation and obstacle avoidance of logistics robots (AMRs) in complex environments, and to address the slow convergence and suboptimal path planning of the traditional Q-Learning algorithm in dynamic environments, this study introduces a fuzzy annealing algorithm to optimize the path nodes and search paths of Q-Learning, removing redundant nodes and unnecessary turns. To balance exploration and exploitation in Q-Learning, a greedy method is proposed to optimize the search strategy. An improved dynamic window approach (DWA) is then used to refine path nodes and smooth acceleration, providing local path planning and improving the search performance and efficiency of the improved Q-Learning algorithm in AMR dynamic obstacle avoidance. The results show that the improved Q-Learning algorithm effectively optimizes the search path and avoids both dynamic and static obstacles well, with a distance margin of at least 1 m relative to the other algorithms compared. In local path planning, its obstacle avoidance trajectory is closer to the expected values and its maximum search time does not exceed 3 s, which is better than the other algorithms. Across different scenarios, it reduces obstacle avoidance path length and motion time by more than 10% and achieves an obstacle avoidance success rate above 90%. The method can meet the needs of engineering fields such as smart warehousing and intelligent manufacturing for efficient and safe operation of logistics robots.
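
As a rough illustration of two ideas named in the abstract (a greedy search strategy that balances exploration and exploitation, and removal of redundant path nodes), the following is a minimal Python sketch, not the authors' implementation. Everything specific in it is an assumption: the 5x5 grid, the reward values, the decaying epsilon-greedy schedule, and the hyperparameters are hypothetical, and the simple collinear-node pruning in `prune_collinear` stands in for the paper's fuzzy annealing optimization, whose details the abstract does not give. Dynamic obstacles and the improved DWA local planner are not modelled.

```python
import random

# Hypothetical 5x5 warehouse grid: 0 = free cell, 1 = static obstacle.
GRID = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
START, GOAL = (0, 0), (4, 4)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Apply an action; a wall or obstacle hit leaves the robot in place."""
    r, c = state[0] + action[0], state[1] + action[1]
    if not (0 <= r < len(GRID) and 0 <= c < len(GRID[0])) or GRID[r][c] == 1:
        return state, -5.0, False   # collision penalty (assumed value)
    if (r, c) == GOAL:
        return (r, c), 100.0, True  # goal reward (assumed value)
    return (r, c), -1.0, False      # per-step cost favours short paths


def train(episodes=2000, alpha=0.1, gamma=0.9,
          eps=1.0, eps_min=0.05, decay=0.995, seed=0):
    """Tabular Q-learning with a decaying epsilon-greedy policy."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS)
         for r in range(len(GRID)) for c in range(len(GRID[0]))}
    for _ in range(episodes):
        state = START
        for _ in range(200):  # cap on episode length
            if rng.random() < eps:   # explore: random action
                a = rng.randrange(len(ACTIONS))
            else:                    # exploit: best known action
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
            nxt, reward, done = step(state, ACTIONS[a])
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
            if done:
                break
        eps = max(eps_min, eps * decay)  # shift from exploration to exploitation
    return q


def greedy_path(q, max_steps=50):
    """Follow the highest-valued action from START until GOAL or the step cap."""
    path, state = [START], START
    for _ in range(max_steps):
        if state == GOAL:
            break
        a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
        state, _, _ = step(state, ACTIONS[a])
        path.append(state)
    return path


def prune_collinear(path):
    """Keep only the endpoints and the nodes where the heading changes."""
    if len(path) < 3:
        return list(path)
    kept = [path[0]]
    for prev, cur, nxt in zip(path, path[1:], path[2:]):
        d1 = (cur[0] - prev[0], cur[1] - prev[1])
        d2 = (nxt[0] - cur[0], nxt[1] - cur[1])
        if d1 != d2:  # a turn: this node must stay
            kept.append(cur)
    kept.append(path[-1])
    return kept


if __name__ == "__main__":
    full = greedy_path(train())
    print("grid path:", full)
    print("waypoints after pruning:", prune_collinear(full))
```

Starting with a high exploration rate and decaying it is one common way to trade early exploration for later exploitation. The pruning step only shows why dropping straight-line intermediate nodes shortens the waypoint list that a local planner has to track.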

Keywords: logistics robot; Q-Learning algorithm; DWA; multi-objective planning; obstacle; obstacle avoidance

Classification code: TP242 [Automation and Computer Technology: Detection Technology and Automatic Devices]

 
