Performance Evaluation ofMulti-Agent Reinforcement Learning Algorithms  

在线阅读下载全文

作  者:Abdulghani M.Abdulghani Mokhles M.Abdulghani Wilbur L.Walters Khalid H.Abed 

机构地区:[1]Department of Electrical&Computer Engineering and Computer Science,Jackson State University,Jackson,MS 39217,USA

出  处:《Intelligent Automation & Soft Computing》2024年第2期337-352,共16页智能自动化与软计算(英文)

基  金:supported in part by United States Air Force Research Institute for Tactical Autonomy(RITA)University Affiliated Research Center(UARC);in part by the United States Air Force Office of Scientific Research(AFOSR)Contract FA9550-22-1-0268 awarded to KHA,https://www.afrl.af.mil/AFOSR/;The contract is entitled:“Investigating Improving Safety of Autonomous Exploring Intelligent Agents with Human-in-the-Loop Reinforcement Learning,”and in part by Jackson State University.

摘  要:Multi-Agent Reinforcement Learning(MARL)has proven to be successful in cooperative assignments.MARL is used to investigate how autonomous agents with the same interests can connect and act in one team.MARL cooperation scenarios are explored in recreational cooperative augmented reality environments,as well as realworld scenarios in robotics.In this paper,we explore the realm of MARL and its potential applications in cooperative assignments.Our focus is on developing a multi-agent system that can collaborate to attack or defend against enemies and achieve victory withminimal damage.To accomplish this,we utilize the StarCraftMulti-Agent Challenge(SMAC)environment and train four MARL algorithms:Q-learning with Mixtures of Experts(QMIX),Value-DecompositionNetwork(VDN),Multi-agent Proximal PolicyOptimizer(MAPPO),andMulti-Agent Actor Attention Critic(MAA2C).These algorithms allow multiple agents to cooperate in a specific scenario to achieve the targeted mission.Our results show that the QMIX algorithm outperforms the other three algorithms in the attacking scenario,while the VDN algorithm achieves the best results in the defending scenario.Specifically,the VDNalgorithmreaches the highest value of battle wonmean and the lowest value of dead alliesmean.Our research demonstrates the potential forMARL algorithms to be used in real-world applications,such as controllingmultiple robots to provide helpful services or coordinating teams of agents to accomplish tasks that would be impossible for a human to do.The SMAC environment provides a unique opportunity to test and evaluate MARL algorithms in a challenging and dynamic environment,and our results show that these algorithms can be used to achieve victory with minimal damage.

关 键 词:Reinforcement learning RL MULTI-AGENT MARL SMAC VDN QMIX MAPPO 

分 类 号:TP18[自动化与计算机技术—控制理论与控制工程]

 

参考文献:

正在载入数据...

 

二级参考文献:

正在载入数据...

 

耦合文献:

正在载入数据...

 

引证文献:

正在载入数据...

 

二级引证文献:

正在载入数据...

 

同被引文献:

正在载入数据...

 

相关期刊文献:

正在载入数据...

相关的主题
相关的作者对象
相关的机构对象