Autonomous Tracking of UAV Aerial Target Based on Deep Reinforcement Learning

Cited by: 3


Authors: YANG Xinghao; SONG Jianmei[1]; SHE Haoping[1]; WU Chengjie; YANG Qinning; FU Weida[3]

Affiliations: [1] School of Aerospace Engineering, Beijing Institute of Technology, Beijing 100081, China; [2] China Aero Institute of System Engineering, Beijing 100012, China; [3] DFH Satellite Co., Ltd., Beijing 100094, China

Source: Computer Measurement & Control, 2022, No. 10, pp. 88-94, 102 (8 pages)

Abstract: To address autonomous target tracking in aerial docking tasks, an end-to-end target tracking method based on deep reinforcement learning is proposed. The method adopts the proximal policy optimization (PPO) algorithm, with the Actor network and the Critic network sharing the parameters of their first two layers. Images captured by the unmanned aerial vehicle (UAV) are used as the input to a convolutional neural network, and the policy network controls the motor speeds of the multirotor UAV, achieving end-to-end target tracking. A shaping method is also used to accelerate agent training. A simulation environment is built with the physics engine Pybullet for training and validation. The simulation results show that the method meets the specified target tracking requirements and has good robustness.
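The abstract names two ingredients without giving formulas: the PPO update and a shaping method for faster training. The sketch below is only an illustration of what these commonly look like, not the authors' implementation. It assumes the standard clipped surrogate objective for PPO and potential-based reward shaping with the negative distance to the target as the potential; the function names, `clip_eps`, `gamma`, and the distance-based potential are all hypothetical choices that the abstract does not specify.

```python
import math


def ppo_clipped_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy. clip_eps is an assumed value.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(lp_new - lp_old)  # probability ratio
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the pessimistic (smaller) of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


def shaped_reward(reward, dist_prev, dist_next, gamma=0.99):
    """Potential-based shaping with assumed potential phi(s) = -distance.

    Adds gamma * phi(s') - phi(s) to the environment reward, so moving
    closer to the target yields a positive bonus.
    """
    phi_prev = -dist_prev
    phi_next = -dist_next
    return reward + gamma * phi_next - phi_prev
```

In a setup like the one described, `dist_prev` and `dist_next` could be the UAV-to-target distances before and after a simulation step, and the shaped reward would replace the raw reward when computing advantages for the PPO objective.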

Keywords: deep reinforcement learning; proximal policy optimization; unmanned aerial vehicle (UAV); target tracking; end-to-end

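Classification: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]; V279.2 [Automation and Computer Technology: Control Science and Engineering]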

 
