Authors: HONG Zi-qi (洪子祺), XU Wen-bo (许文波)[2], LV Chen (吕晨), OUYANG Quan (欧阳权), WANG Zhi-sheng (王志胜)[1]
Affiliations: [1] School of Automation Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, Jiangsu, China; [2] Laboratory of Aerospace Servo Actuation and Transmission, Beijing Institute of Precision Mechatronics and Controls, Beijing 100076, China
Source: Journal of Mechanical & Electrical Engineering (《机电工程》), 2023, No. 7, pp. 1071-1078 (8 pages)
Funding: Open Fund of the Laboratory of Aerospace Servo Actuation and Transmission (LASAT-20210502).
Abstract: To address the difficulty of selecting well-performing parameters for traditional proportional-integral (PI) control, a reinforcement learning-PI control method based on genetic algorithm optimization is proposed, with an air rudder servo system as the research object. First, a mathematical model of the air rudder servo system is established. Then, the initial parameters of the PI controller are optimized with a genetic algorithm, and the deep deterministic policy gradient (DDPG) algorithm tunes the PI controller in real time, realizing position command control of the air rudder servo system. Finally, the method is verified on the air rudder servo system through simulation in Simulink. The results show that the improved algorithm retains a degree of online stability under parameter perturbation. Under no load, its settling time is at least 20% shorter than that of genetic algorithm-PI, DDPG-PI, and traditional PI. Under load, compared with the other three methods, its fluctuation amplitude and its time to return to steady state after the load ends are both reduced by at least 15%, which demonstrates the effectiveness of the method for the air rudder servo system.
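The genetic-algorithm initialization step described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the plant is a generic first-order lag rather than the air rudder servo model, the cost is the ITAE of a unit-step response, and the names `step_cost` and `ga_tune_pi`, the gain bounds, and the GA settings are hypothetical. The DDPG online tuning stage is not covered.

```python
import random


def step_cost(kp, ki, dt=0.001, t_end=1.0, tau=0.05, gain=1.0):
    """ITAE of a unit-step response: PI loop around an ASSUMED first-order plant."""
    y = 0.0       # plant output
    integ = 0.0   # integral of the tracking error
    cost = 0.0
    for k in range(int(t_end / dt)):
        e = 1.0 - y                        # error against a unit step command
        integ += e * dt
        u = kp * e + ki * integ            # PI control law
        u = max(-10.0, min(10.0, u))       # assumed actuator saturation
        y += dt * (-y + gain * u) / tau    # forward-Euler plant update
        cost += (k * dt) * abs(e) * dt     # time-weighted absolute error
    return cost


def ga_tune_pi(baseline=(1.0, 1.0), bounds=((0.0, 20.0), (0.0, 200.0)),
               pop_size=20, generations=30, seed=0):
    """Search (kp, ki) with a small elitist GA; returns (best_gains, best_cost)."""
    rng = random.Random(seed)

    def clip(ind):
        return tuple(min(hi, max(lo, g)) for g, (lo, hi) in zip(ind, bounds))

    # Seed the population with the hand-picked baseline plus random gains.
    pop = [clip(baseline)] + [
        tuple(rng.uniform(lo, hi) for lo, hi in bounds)
        for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        scored = sorted(pop, key=lambda ind: step_cost(*ind))
        elite = scored[:2]                 # elitism: the best two survive unchanged
        children = []
        while len(children) < pop_size - len(elite):
            # Tournament selection on rank (lower index means lower cost).
            i, j = (min(rng.sample(range(pop_size), 3)) for _ in range(2))
            w = rng.random()               # blend crossover weight
            child = tuple(w * a + (1 - w) * b
                          for a, b in zip(scored[i], scored[j]))
            # Gaussian mutation scaled to 5% of each gain's range.
            child = tuple(g + rng.gauss(0.0, 0.05 * (hi - lo))
                          for g, (lo, hi) in zip(child, bounds))
            children.append(clip(child))
        pop = elite + children
    best = min(pop, key=lambda ind: step_cost(*ind))
    return best, step_cost(*best)


if __name__ == "__main__":
    gains, cost = ga_tune_pi()
    print("tuned (kp, ki):", gains, "ITAE:", cost)
```

Because the baseline gains are placed in the initial population and the best individuals are carried over each generation, the returned cost can never be worse than the baseline's. In the scheme the abstract describes, gains found this way would serve only as the starting point that the DDPG agent then adjusts online.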
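Keywords: servo system; proportional-integral (PI) controller; genetic algorithm; deep deterministic policy gradient algorithm; parameter optimization; SIMULINK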
Classification: TH-39 [Mechanical Engineering]; TJ765 [Weapon Science and Technology: Weapon Systems and Utilization Engineering]