Autonomous Control Algorithm for Quadrotor Based on Deep Reinforcement Learning

Cited by: 5

Authors: LIANG Ji; WANG Lisong[1]; HUANG Yuzhou; QIN Xiaolin[1] (College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China)

Affiliation: [1] College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China

Source: Computer Science (《计算机科学》), 2023, Issue S02, pp. 1-7 (7 pages)

Funding: National Natural Science Foundation of China (61972198).

Abstract: As unmanned aerial vehicles (UAVs) come into wide use, UAV controller design has become an active research topic. The control algorithms now common on UAVs, such as PID and MPC, are limited by hard-to-tune parameters, complex model construction, and heavy computation. To address these problems, this paper proposes an autonomous UAV control method based on deep reinforcement learning. The method uses a neural network to approximate the UAV controller, mapping the UAV state directly to the actuator (servo) outputs that drive the vehicle. A general-purpose controller is obtained through repeated interaction with the environment during training, which avoids manual parameter tuning and explicit model construction. To further improve convergence speed and accuracy, expert information is introduced into the Soft Actor-Critic (SAC) reinforcement learning algorithm, yielding the ESAC algorithm, which guides the UAV's exploration of the environment and improves the usability and extensibility of the control policy. Finally, on UAV position control and trajectory tracking tasks, the ESAC controller is compared with a conventional PID controller and with controllers built by reinforcement learning algorithms such as SAC and DDPG. The experimental results show that the ESAC controller matches or even exceeds the control performance of the PID controller and is more stable and accurate than the controllers built with SAC and DDPG.
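The abstract does not say how ESAC injects expert information into SAC, so the following is only an illustrative sketch of one common way an expert can guide exploration: during data collection, the action comes from an expert controller with probability `beta` and from the learner otherwise, with `beta` decayed over episodes. Everything here is an assumption made for illustration, not the authors' algorithm: the 1-D altitude model, the PD gains, the random placeholder policy, the reward, and all function names are hypothetical.

```python
import random

DT = 0.02  # integration step in seconds (assumed value)

def step(state, accel):
    """Toy 1-D altitude model (assumed): state = (height, vertical speed)."""
    z, vz = state
    vz = vz + accel * DT
    z = z + vz * DT
    return (z, vz)

def pd_expert(state, target, kp=4.0, kd=3.0):
    """Hand-tuned PD law standing in for the expert (gains are assumed)."""
    z, vz = state
    return kp * (target - z) - kd * vz

def untrained_policy(state, target, rng):
    """Placeholder for a stochastic neural policy: here it is pure noise."""
    return rng.uniform(-5.0, 5.0)

def choose_action(state, target, beta, rng):
    """With probability beta follow the expert, otherwise the learner."""
    if rng.random() < beta:
        return pd_expert(state, target), True
    return untrained_policy(state, target, rng), False

def collect_episode(beta, rng, target=1.0, steps=500):
    """Roll out one episode; return (transitions, final height error)."""
    state = (0.0, 0.0)
    buffer = []
    for _ in range(steps):
        action, from_expert = choose_action(state, target, beta, rng)
        next_state = step(state, action)
        reward = -abs(target - next_state[0])  # assumed tracking reward
        buffer.append((state, action, reward, next_state, from_expert))
        state = next_state
    return buffer, abs(target - state[0])

if __name__ == "__main__":
    rng = random.Random(0)
    for episode in range(5):
        beta = max(0.0, 1.0 - 0.25 * episode)  # linearly decay expert share
        buffer, err = collect_episode(beta, rng)
        share = sum(t[4] for t in buffer) / len(buffer)
        print(f"episode {episode}: beta={beta:.2f} "
              f"expert share={share:.2f} final error={err:.3f}")
```

In a full pipeline the collected transitions would feed an off-policy learner such as SAC, which can train on data gathered by a different behavior policy. The sketch stops at data collection because the abstract gives no detail on the update rule.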

Keywords: reinforcement learning; quadrotor UAV; autonomous control; expert policy

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
