Authors: Zhou Zhiyong (周志勇) [1], Mo Fei (莫非), Zhao Kai (赵凯), Hao Yunbo (郝云波), Qian Yufeng (钱宇峰)
Affiliations: [1] School of Mechanical Engineering, Shanghai Dianji University, Shanghai 201306, China; [2] Shanghai Aerospace Equipment Manufacturing General Factory Co., Ltd., Shanghai 200245, China
Source: Journal of System Simulation (《系统仿真学报》), 2024, No. 6, pp. 1425-1432 (8 pages)
Funding: Major Industrial Technology Research Program of Minhang District, Shanghai (2022MH-ZD20).
Abstract: A six-axis robotic arm is built with the MATLAB physics engine coupled to Python, and a complex control environment with disturbances is simulated, giving the arm a trial-and-error training environment that cannot be provided in reality. The proximal policy optimization (PPO) algorithm from reinforcement learning is used to improve the traditional PID control algorithm. Following the multi-agent idea, and based on the different effects of the three PID parameters on the control system and on the characteristics of the six-axis robotic arm, the three parameters are each trained as a separate agent, yielding a new multi-agent adaptive PID algorithm in which the agents adjust the parameters adaptively. Simulation results show that the algorithm converges better in training than the MA-DDPG and MA-SAC algorithms. Compared with the traditional PID algorithm, it suppresses oscillation more effectively when disturbances and oscillations occur, has lower overshoot and a shorter settling time, and gives a smoother control process. This improves the control accuracy of the robotic arm and demonstrates the robustness and effectiveness of the algorithm.
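Keywords: reinforcement learning; proximal policy optimization; adaptive PID tuning; robotic arm; multi-agent
Classification number: TP242.2 [Automation and Computer Technology: Detection Technology and Automatic Equipment]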
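Illustrative sketch: the abstract describes treating each of the three PID parameters as a separately trained agent that adjusts its gain online. The record does not give the implementation, so the Python below is only a minimal sketch of that structure under stated assumptions: the class name `AdaptivePID`, the method names, the additive gain update, and the clamping of gains at zero are all hypothetical, and the PPO agents themselves are not modeled. In a full system, each of the three increments passed to `adjust` would come from its own trained policy.

```python
from dataclasses import dataclass


@dataclass
class AdaptivePID:
    """Discrete PID controller whose gains can be nudged at every step.

    Hypothetical illustration only; not the authors' implementation.
    """
    kp: float
    ki: float
    kd: float
    dt: float = 0.01          # sample time in seconds (assumed value)
    integral: float = 0.0     # running integral of the error
    prev_error: float = 0.0   # error from the previous step

    def adjust(self, d_kp: float, d_ki: float, d_kd: float) -> None:
        # Each increment stands in for the action of one agent
        # (one agent per gain). Gains are clamped at zero, an assumption
        # made here to keep the controller well-behaved.
        self.kp = max(0.0, self.kp + d_kp)
        self.ki = max(0.0, self.ki + d_ki)
        self.kd = max(0.0, self.kd + d_kd)

    def step(self, setpoint: float, measurement: float) -> float:
        # Standard discrete PID law using the current gains.
        error = setpoint - measurement
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

A training loop would alternate `adjust` (actions from the three agents) and `step` (control output applied to one joint of the simulated arm), with a reward built from quantities such as tracking error and overshoot. The actual reward design and observation space are not stated in this record.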