A Multi-UAV Cooperative Exploration Method Based on Improved Multi-Agent PPO

Cited by: 2

Authors: AN Cheng'an; ZHOU Sida (School of Electrical Information Engineering, Yunnan Minzu University, Kunming 650000, China)

Affiliation: [1] School of Electrical Information Engineering, Yunnan Minzu University, Kunming 650000, China

Source: Electronics Optics & Control, 2024, No. 1, pp. 51-56 (6 pages)

Funding: National Natural Science Foundation of China (61963038).

Abstract: Exploring an unknown environment with multiple UAVs can improve the robustness and efficiency of the exploration task. Unlike heuristic methods, multi-agent deep reinforcement learning removes the need for hand-crafted rules: each UAV is treated as an agent that learns more effective "rules" on its own by interacting with the environment. A multi-threaded multi-UAV simulation environment is built to support cooperative training. A shared Multi-Agent Proximal Policy Optimization method combined with a Long Short-Term Memory network (LSTM-MAPPO) is proposed to suit the multi-threaded environment, and global boundary information is added to the cooperative LSTM-MAPPO method to increase the area explored in each episode. Numerical experiments show that: 1) compared with the existing Multi-Agent Deep Deterministic Policy Gradient (MADDPG) method, the proposed method still converges stably in the later stage of training with continuous actions; 2) compared with the existing LSTM-MAPPO method, its final reward stays stably above 5000; and 3) on three different simulation maps, the trained network stably explores more than 70% of the area during testing.
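For readers unfamiliar with the PPO family that the abstract builds on, the following is a minimal sketch of the standard clipped surrogate objective from the original PPO formulation. It is not the authors' LSTM-MAPPO loss, which the abstract does not specify; the function name `ppo_clip_objective` and the default `clip_eps=0.2` are illustrative assumptions.

```python
import math


def ppo_clip_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective (to be maximized) over a batch.

    Illustrative only: a generic PPO objective, not the paper's exact loss.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the data-collecting policy.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the pessimistic (smaller) of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

In a shared-policy multi-agent setting, one common choice is to pool the samples of all agents into the same batch before evaluating such an objective; whether the paper does exactly this is not stated in the abstract.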

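Keywords: multi-UAV cooperation; multi-agent deep reinforcement learning; unknown environment exploration; flight path planning; multi-threading; long short-term memory network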

CLC number: V279 [Aerospace Science and Technology: Aircraft Design]

 
