Fast UAV path planning in urban environments based on three-step experience buffer sampling DDPG  


Authors: Shasha Tian, Yuanxiang Li, Xiao Zhang, Lu Zheng, Linhui Cheng, Wei She, Wei Xie

Affiliations: [1] College of Computer Science, South-Central Minzu University, Wuhan, 430074, China; [2] Hubei Provincial Engineering Research Center for Intelligent Management of Manufacturing Enterprises, Wuhan, 430074, China; [3] School of Computer Science, Wuhan University, Wuhan, 430072, China; [4] School of Mathematics & Statistics, South-Central Minzu University, Wuhan, 430074, China

Source: Digital Communications and Networks (数字通信与网络(英文版)), 2024, No. 4, pp. 813-826 (14 pages)

Funding: Supported in part by the Hubei Provincial Science and Technology Major Project of China (Grant No. 2020AEA011); in part by the National Ethnic Affairs Commission of the People’s Republic of China (Training Program for Young and Middle-aged Talents) (No. MZR20007); in part by the National Natural Science Foundation of China (Grant No. 61902437); in part by the Hubei Provincial Natural Science Foundation of China (Grant No. 2020CFB629); in part by the Application Foundation Frontier Project of Wuhan Science and Technology Program (Grant No. 2020020601012267); in part by the Fundamental Research Funds for the Central Universities, South-Central Minzu University (No. CZQ21026); and in part by the Special Project on Regional Collaborative Innovation of Xinjiang Uygur Autonomous Region (Plan to Aid Xinjiang with Science and Technology) (2022E02035).

Abstract: Path planning for Unmanned Aerial Vehicles (UAVs) is a critical issue in emergency communication and rescue operations, especially in adversarial urban environments. Due to the continuity of the flying space, complex building obstacles, and the aircraft's high dynamics, traditional algorithms cannot find the optimal collision-free flying path between the UAV station and the destination. Accordingly, in this paper, we study the fast UAV path planning problem from a source point to a target point in a 3D urban environment and propose a Three-Step Experience Buffer Deep Deterministic Policy Gradient (TSEB-DDPG) algorithm. We first build a 3D model of a complex urban environment with buildings and project the 3D building surfaces into many 2D geometric shapes. After this transformation, we propose Hierarchical Learning Particle Swarm Optimization (HL-PSO) to obtain the empirical path. Then, to ensure the accuracy of the obtained paths, the empirical path, the collision information, and the fast transition information are stored in the three experience buffers of the TSEB-DDPG algorithm as dynamic guidance information. The sampling ratio of each buffer is dynamically adapted to the training stage. Moreover, we design a reward mechanism to improve the convergence speed of the DDPG algorithm for UAV path planning. The proposed TSEB-DDPG algorithm is compared experimentally with three widely used competitors, and the results show that it achieves the fastest convergence speed and the highest accuracy. We also conduct experiments in real scenarios and compare the actual path planning obtained by the HL-PSO, DDPG, and TSEB-DDPG algorithms. The results show that the TSEB-DDPG algorithm achieves nearly the best results in terms of accuracy, average time of actual path planning, and success rate.
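The abstract's idea of drawing each training batch from three experience buffers with stage-dependent ratios can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `ThreeBufferSampler`, the buffer names, and the linear schedule (the empirical-path share decaying from 0.6 to 0, the collision share fixed at 0.2) are assumptions chosen only for illustration, since the abstract does not specify the actual ratios.

```python
import random
from collections import deque


class ThreeBufferSampler:
    """Illustrative three-buffer replay sampler (hypothetical, not the paper's code)."""

    def __init__(self, capacity=10_000, seed=0):
        # One bounded buffer per kind of experience named in the abstract.
        self.buffers = {
            "empirical": deque(maxlen=capacity),   # e.g. transitions along an HL-PSO path
            "collision": deque(maxlen=capacity),   # transitions that ended in a collision
            "transition": deque(maxlen=capacity),  # ordinary agent transitions
        }
        self.rng = random.Random(seed)

    def add(self, name, transition):
        self.buffers[name].append(transition)

    def counts(self, batch_size, progress):
        """Per-buffer sample counts for a training progress value in [0, 1].

        Assumed schedule: rely on the empirical path early, then shift
        toward the agent's own transitions as training advances.
        """
        p = min(max(progress, 0.0), 1.0)
        n_emp = round(0.6 * (1.0 - p) * batch_size)
        n_col = round(0.2 * batch_size)
        n_tr = max(0, batch_size - n_emp - n_col)
        return {"empirical": n_emp, "collision": n_col, "transition": n_tr}

    def sample(self, batch_size, progress):
        batch = []
        for name, n in self.counts(batch_size, progress).items():
            buf = list(self.buffers[name])
            # Never request more items than the buffer currently holds.
            batch.extend(self.rng.sample(buf, min(n, len(buf))))
        # If a buffer was too small, top up from ordinary transitions (with replacement).
        fallback = list(self.buffers["transition"])
        while len(batch) < batch_size and fallback:
            batch.append(self.rng.choice(fallback))
        return batch
```

With a batch size of 10, this assumed schedule draws 6 empirical, 2 collision, and 2 ordinary transitions at the start of training, and 0, 2, and 8 respectively at the end.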

Keywords: Unmanned aerial vehicle; Path planning; Deep deterministic policy gradient; Three-step experience buffer; Particle swarm optimization

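Classification codes: V279 [Aerospace Science and Technology: Flight Vehicle Design]; TP18 [Automation and Computer Technology: Control Theory and Control Engineering]; V249 [Automation and Computer Technology: Control Science and Engineering]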

 
