Authors: QI Xuan (祁璇); ZHOU Tong (周通); WANG Cunsong (王村松); PENG Xiaotian (彭孝天); PENG Hao (彭浩)[1]
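Affiliations: [1] School of Mechanical and Power Engineering, Nanjing Tech University, Nanjing 211816, Jiangsu, China; [2] Institute of Intelligent Manufacturing, Nanjing Tech University, Nanjing 210009, Jiangsu, China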
Source: Computer Integrated Manufacturing Systems (《计算机集成制造系统》), 2025, No. 3, pp. 955-964 (10 pages)
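Funding: National Key R&D Program of China (2021YFB3301300); National Natural Science Foundation of China (62203213); Jiangsu Province Excellent Postdoctoral Talent Funding Program ("卓博计划") (2023ZB756).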
Abstract: The automated guided vehicle (AGV) is a highly flexible and adaptable piece of automated material handling equipment that supports path planning, task scheduling, and intelligent task allocation. Current research on optimal path and scheduling algorithms for AGVs still suffers from poor generalization, low convergence efficiency, and long path-finding times. An improved Proximal Policy Optimization (PPO) algorithm is therefore proposed. First, a multi-step-length action selection strategy increases the AGV's movement step length, adding 8 directions to the original 4-direction action set in order to shorten the optimal path. Second, the dynamic reward function is improved so that the reward value is adjusted in real time according to the AGV's current state, which enhances its learning ability. Third, the reward curves of the different improvement methods are compared to verify the algorithm's convergence efficiency and optimal path distance. Finally, a multi-task scheduling optimization algorithm for a single AGV is designed to improve transportation efficiency. The results show that the improved algorithm shortens the optimal path by 28.6% and raises convergence efficiency by 78.5% relative to the PPO algorithm; it performs better on more complex tasks that require high-level policies and generalizes better. Compared with Q-learning, Deep Q-Network (DQN), and Soft Actor-Critic (SAC), the improved algorithm is 84.4%, 83.7%, and 77.9% more efficient, respectively. After multi-task scheduling optimization for a single AGV, the average path length is reduced by 47.6%.
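The two modifications named in the abstract, an enlarged action set and a state-dependent reward, can be illustrated with a small grid-world sketch. The abstract does not say which 8 directions are added or how the reward is computed, so everything below is an assumption made for illustration: the choice of diagonal and two-cell moves, the grid encoding (0 = free, 1 = obstacle), the reward constants, and the helper names `step` and `dynamic_reward` are all hypothetical and are not taken from the paper.

```python
import math

# Hypothetical 12-action set. The abstract says 8 directions are added to the
# original 4 but does not list them; this particular choice is an assumption.
BASE_ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]        # original 4 unit moves
DIAGONAL_ACTIONS = [(1, 1), (1, -1), (-1, 1), (-1, -1)]  # assumed: diagonal moves
LONG_ACTIONS = [(0, 2), (0, -2), (2, 0), (-2, 0)]        # assumed: two-cell moves
ACTIONS = BASE_ACTIONS + DIAGONAL_ACTIONS + LONG_ACTIONS


def step(pos, action_index, grid):
    """Apply one action on a grid (0 = free, 1 = obstacle).

    Returns (new_pos, hit_obstacle). A blocked or out-of-bounds move
    leaves the AGV where it was.
    """
    dx, dy = ACTIONS[action_index]
    x, y = pos
    nx, ny = x + dx, y + dy
    rows, cols = len(grid), len(grid[0])
    if not (0 <= nx < rows and 0 <= ny < cols) or grid[nx][ny] == 1:
        return pos, True
    # A two-cell move must also pass through a free intermediate cell.
    if abs(dx) == 2 or abs(dy) == 2:
        mx, my = x + dx // 2, y + dy // 2
        if grid[mx][my] == 1:
            return pos, True
    return (nx, ny), False


def dynamic_reward(old_pos, new_pos, goal, hit_obstacle):
    """Assumed reward shaping: the reward depends on the current state.

    Collisions are penalised, reaching the goal is rewarded, and every
    other step earns a reward proportional to the progress made toward
    the goal minus a small per-step cost.
    """
    if hit_obstacle:
        return -1.0
    if new_pos == goal:
        return 10.0
    progress = math.dist(old_pos, goal) - math.dist(new_pos, goal)
    return 0.1 * progress - 0.01
```

In a PPO training loop, `ACTIONS` would define the size of the policy's discrete output and `dynamic_reward` would supply the per-step reward. Longer and diagonal moves let the agent cover the same distance in fewer decisions, which is one plausible reading of why a larger action set could shorten the learned path.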
Keywords: automated guided vehicle; path planning; task scheduling; proximal policy optimization algorithm; reinforcement learning
Classification: TP249 [Automation and Computer Technology: Detection Technology and Automatic Devices]