Adaptive PID Control Algorithm Based on PPO (original title: 基于PPO的自适应PID控制算法研究)

Authors: Zhou Zhiyong [1]; Mo Fei; Zhao Kai; Hao Yunbo; Qian Yufeng

Affiliations: [1] School of Mechanical Engineering, Shanghai Dianji University, Shanghai 201306, China; [2] Shanghai Aerospace Equipment Manufacturing General Factory Co., Ltd., Shanghai 200245, China

Source: Journal of System Simulation (《系统仿真学报》), 2024, No. 6, pp. 1425-1432 (8 pages)

Funding: Major Industrial Technology Research Program of Minhang District, Shanghai (2022MH-ZD20).

Abstract: A six-axis robotic arm is built with the MATLAB physics engine coupled to Python, and a complex control environment with disturbances is simulated, giving the arm a trial-and-error training environment that cannot be provided in reality. The proximal policy optimization (PPO) algorithm from reinforcement learning is used to improve the traditional PID control algorithm. Following a multi-agent idea, and based on the different effects of the three PID parameters on the control system and on the characteristics of the six-axis arm, each of the three parameters is trained as a separate agent, yielding a new multi-agent adaptive PID algorithm in which the agents adjust the parameters adaptively. Simulation results show that the algorithm converges better in training than the MA-DDPG and MA-SAC algorithms. Compared with the traditional PID algorithm, when disturbances and oscillations occur it suppresses oscillation more effectively, has lower overshoot and a shorter settling time, and produces a smoother control process. This improves the control accuracy of the robotic arm and demonstrates the robustness and effectiveness of the algorithm.

Keywords: reinforcement learning; proximal policy optimization; adaptive PID tuning; robotic arm; multi-agent

Classification code: TP242.2 [Automation and Computer Technology: Detection Technology and Automation Devices]
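
The abstract describes the method only at a high level, so the following Python is a rough sketch of the general idea and not the authors' implementation. It assumes a toy first-order plant in place of the six-axis arm. The names `PID`, `clipped_surrogate`, and `run_episode` are hypothetical, and the three "agents" are placeholder callables standing in for trained PPO policies, one per gain. `clipped_surrogate` shows the standard PPO clipped objective that such policies would typically be trained to maximise.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class PID:
    """Discrete PID controller whose gains may be changed between steps."""
    kp: float
    ki: float
    kd: float
    dt: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, error: float) -> float:
        # Accumulate the integral term and take a backward-difference derivative.
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def clipped_surrogate(ratios: Sequence[float],
                      advantages: Sequence[float],
                      eps: float = 0.2) -> float:
    """Mean PPO clipped surrogate objective over a batch (to be maximised)."""
    total = 0.0
    for r, a in zip(ratios, advantages):
        clipped = min(max(r, 1.0 - eps), 1.0 + eps)
        total += min(r * a, clipped * a)
    return total / len(ratios)


# Hypothetical agent interface: (tracking error, current gain) -> new gain.
GainAgent = Callable[[float, float], float]


def run_episode(pid: PID, agents: Sequence[GainAgent],
                setpoint: float, steps: int) -> List[float]:
    """Track a setpoint on an assumed toy plant x' = -x + u (not the paper's arm)."""
    x = 0.0
    history = []
    for _ in range(steps):
        error = setpoint - x
        # Each agent adjusts exactly one gain, mirroring the one-agent-per-parameter idea.
        pid.kp = agents[0](error, pid.kp)
        pid.ki = agents[1](error, pid.ki)
        pid.kd = agents[2](error, pid.kd)
        u = pid.step(error)
        x += (-x + u) * pid.dt  # forward-Euler integration of the plant
        history.append(x)
    return history
```

With three identity agents such as `lambda error, gain: gain`, the loop reduces to a fixed-gain PID, which gives a baseline against which adaptive agents could be compared.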
