UAV maneuvering decision-making algorithm based on deep reinforcement learning under the guidance of expert experience  


Authors: ZHAN Guang, ZHANG Kun, LI Ke, PIAO Haiyin

Affiliations: [1] School of Electronics and Information, Northwestern Polytechnical University, Xi'an 710072, China; [2] Science and Technology on Electro-Optic Control Laboratory, Luoyang 471009, China

Source: Journal of Systems Engineering and Electronics (系统工程与电子技术(英文版)), 2024, Issue 3, pp. 644-665 (22 pages)

Funding: Supported by the Key Research and Development Program of Shaanxi (2022GXLH-02-09); the Aeronautical Science Foundation of China (20200051053001); and the Natural Science Basic Research Program of Shaanxi (2020JM-147).

Abstract: Autonomous unmanned aerial vehicle (UAV) manipulation is necessary for the defense department to execute tactical missions given by commanders in the future unmanned battlefield. A large amount of research has been devoted to improving the autonomous decision-making ability of UAVs in interactive environments, where finding the optimal maneuvering decision-making policy has become one of the key issues in enabling UAV intelligence. In this paper, we propose a maneuvering decision-making algorithm for autonomous air-delivery based on deep reinforcement learning under the guidance of expert experience. Specifically, we refine the guidance-towards-area and guidance-towards-specific-point tasks for the air-delivery process based on traditional air-to-surface fire control methods. Moreover, we construct the UAV maneuvering decision-making model based on Markov decision processes (MDPs), and we present a reward shaping method for the two guidance tasks using a potential-based function and expert-guided advice. The proposed algorithm can accelerate the convergence of the maneuvering decision-making policy and increase the stability of the policy's output during the later stage of training. The effectiveness of the proposed maneuvering decision-making policy is illustrated by the curves of the training parameters and by extensive experiments testing the trained policy.
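Illustrative note: the abstract refers to reward shaping with a potential-based function. The Python sketch below shows only the generic form of potential-based shaping, F(s, s') = γΦ(s') − Φ(s), added to a base reward. It is not the paper's implementation. The potential used here (negative Euclidean distance to a target point), the discount value, and all function names are assumptions made for illustration.

```python
import math

GAMMA = 0.99  # assumed discount factor, not taken from the paper


def potential(position, target):
    """Hypothetical potential: negative Euclidean distance to the target."""
    return -math.dist(position, target)


def shaped_reward(base_reward, position, next_position, target, gamma=GAMMA):
    """Return base_reward plus the shaping term gamma * Phi(s') - Phi(s)."""
    shaping = gamma * potential(next_position, target) - potential(position, target)
    return base_reward + shaping
```

With this assumed potential, a step that brings the vehicle closer to the target earns a positive shaping bonus and a step away earns a penalty. With `gamma=1.0` the shaping terms telescope to zero around any closed path, which is why this form of shaping leaves the optimal policy unchanged.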

Keywords: unmanned aerial vehicle (UAV); maneuvering decision-making; autonomous air-delivery; deep reinforcement learning; reward shaping; expert experience

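Classification: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]; V279 [Automation and Computer Technology: Control Science and Engineering]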

 
