Trajectory control of quadrotor based on reinforcement learning-iterative learning  Cited by: 7


Authors: LIU Xuguang, DU Changping[1], ZHENG Yao[1] (School of Aeronautics and Astronautics, Zhejiang University, Hangzhou, Zhejiang 310027, China)

Affiliation: [1] School of Aeronautics and Astronautics, Zhejiang University, Hangzhou 310027, China

Source: Journal of Computer Applications, 2022, No. 12, pp. 3950-3956 (7 pages)

Abstract: To further improve the trajectory tracking accuracy of quadrotor unmanned aerial vehicles (UAVs) in unknown environments, a control method that adds an iterative learning feedforward controller to the traditional feedback control architecture is proposed. To address the difficulty of tuning learning parameters in Iterative Learning Control (ILC), a method that uses Reinforcement Learning (RL) to tune and optimize the learning parameters of the iterative learning controller is proposed. First, RL is used to optimize the learning parameters of the iterative learning controller, selecting the parameters that are optimal for the current environment and task so that the controller achieves its best control performance. Second, the learning ability of the iterative learning controller is used to optimize the feedforward input iteration by iteration until perfect tracking is achieved. Finally, in a simulation environment with random noise, the proposed Reinforcement Learning-Iterative Learning Control (RL-ILC) algorithm is compared with an ILC method without parameter optimization, a Sliding Mode Control (SMC) method, and a Proportional-Integral-Derivative (PID) control method. Experimental results show that after two iterations the proposed algorithm reduces the total error to 0.2% of the initial error, achieving rapid convergence. In addition, unlike the SMC and PID methods, the RL-ILC algorithm, once converged, does not exhibit noise-induced trajectory fluctuations. The proposed algorithm can therefore effectively improve the accuracy and robustness of UAV trajectory tracking.
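The two-stage idea in the abstract (first pick a learning parameter, then let ILC refine the feedforward input trial by trial) can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the paper: the plant is a hypothetical noise-free first-order system rather than a quadrotor, the ILC law is a plain P-type update, and the tuner is a simple epsilon-greedy bandit standing in for whatever RL algorithm the authors actually used. The names `run_trial`, `ilc`, `tune_gain`, and `CANDIDATES` are likewise hypothetical.

```python
import math
import random

# Hypothetical first-order plant y[t+1] = A*y[t] + B*u[t] (assumption, not the paper's model).
A, B = 0.3, 1.0
N = 50  # samples per trial

# REFERENCE[t] is the desired output one step after input u[t] is applied.
REFERENCE = [math.sin(2 * math.pi * (t + 1) / N) for t in range(N)]
CANDIDATES = [0.2, 0.5, 0.8, 1.0]  # candidate learning gains (assumed values)


def run_trial(u):
    """Simulate one trial from rest; return the tracking error at each step."""
    y = 0.0
    errors = []
    for t in range(N):
        y = A * y + B * u[t]
        errors.append(REFERENCE[t] - y)
    return errors


def ilc(gain, iterations):
    """P-type ILC: u_next[t] = u[t] + gain * error[t].

    Returns the sum of squared errors measured in each trial;
    entry 0 is the error before any learning has happened.
    """
    u = [0.0] * N
    history = []
    for _ in range(iterations):
        errors = run_trial(u)
        history.append(sum(e * e for e in errors))
        u = [u_t + gain * e_t for u_t, e_t in zip(u, errors)]
    return history


def tune_gain(candidates, episodes=30, epsilon=0.2, ilc_iterations=3, seed=0):
    """Epsilon-greedy bandit over learning gains (a stand-in for the RL tuner).

    Reward is the negative total error measured after two ILC updates.
    """
    rng = random.Random(seed)
    q = [0.0] * len(candidates)
    counts = [0] * len(candidates)
    for episode in range(episodes):
        if episode < len(candidates):
            arm = episode  # try every gain once before exploiting
        elif rng.random() < epsilon:
            arm = rng.randrange(len(candidates))  # explore
        else:
            arm = max(range(len(candidates)), key=lambda i: q[i])  # exploit
        reward = -ilc(candidates[arm], ilc_iterations)[-1]
        counts[arm] += 1
        q[arm] += (reward - q[arm]) / counts[arm]  # running mean of rewards
    return candidates[max(range(len(candidates)), key=lambda i: q[i])]


best_gain = tune_gain(CANDIDATES)
history = ilc(best_gain, 6)
print("selected gain:", best_gain)
print("error relative to first trial:", [h / history[0] for h in history])
```

The printed ratios show how quickly the total error shrinks relative to the first trial for the selected gain. The figures reported in the abstract (0.2% of the initial error after two iterations) refer to the authors' quadrotor simulation with random noise and should not be expected from this toy plant.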

Keywords: iterative learning control; reinforcement learning; quadrotor UAV; parameter tuning; trajectory tracking

Classification number: TP273 [Automation and Computer Technology: Detection Technology and Automatic Devices]

 
