An Empirical Study on Google Research Football Multi-agent Scenarios  


Authors: Yan Song, He Jiang, Zheng Tian, Haifeng Zhang, Yingping Zhang, Jiangcheng Zhu, Zonghong Dai, Weinan Zhang, Jun Wang

Affiliations: [1] Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China [2] Digital Brain Lab, Shanghai, 200001, China [3] ShanghaiTech University, Shanghai, 200001, China [4] Huawei Cloud, Guiyang, 550003, China [5] Shanghai Jiao Tong University, Shanghai, 200001, China [6] University College London, London, WC1E 6PT, UK

Source: Machine Intelligence Research (机器智能研究, English edition), 2024, Issue 3, pp. 549-570 (22 pages)

Funding: Supported by the National Natural Science Foundation of China (No. 62206289).

Abstract: Few multi-agent reinforcement learning (MARL) studies on Google Research Football (GRF) focus on the 11-vs-11 multi-agent full-game scenario, and to the best of our knowledge, no open benchmark on this scenario has been released to the public. In this work, we fill the gap by providing a population-based MARL training pipeline and hyperparameter settings for the multi-agent football scenario that, trained from scratch, outperform the bot at difficulty 1.0 within 2 million steps. Our experiments serve as a reference for the expected performance of independent proximal policy optimization (IPPO), a state-of-the-art multi-agent reinforcement learning algorithm in which each agent optimizes its own policy independently, across various training configurations. We also release our training framework, Light-MALib, which extends the MALib codebase with a distributed and asynchronous implementation and additional analytical tools for football games. Finally, we provide guidance for building strong football AI with population-based training and release diverse pretrained policies for benchmarking. The goal is to give the community a head start for anyone experimenting on GRF, together with a simple-to-use population-based training framework for further improving agents through self-play. The implementation is available at https://github.com/Shanghai-Digital-Brain-Laboratory/DB-Football.
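To make the phrase "each agent optimizes its own policy independently" concrete, the following is a minimal sketch of the standard PPO clipped surrogate loss evaluated separately for each agent. It is not taken from Light-MALib or the DB-Football repository; the function names `ppo_clip_loss` and `ippo_losses`, the batch layout, and the default `clip_eps=0.2` are assumptions made purely for illustration.

```python
import math


def ppo_clip_loss(new_logp, old_logp, advantages, clip_eps=0.2):
    """Clipped PPO surrogate loss (to be minimized) for ONE agent's batch.

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy; advantages: advantage estimates.
    All names here are hypothetical, chosen for this sketch only.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(lp_new - lp_old)  # importance ratio pi_new / pi_old
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic (lower) bound of the unclipped and clipped objectives.
        total += min(ratio * adv, clipped * adv)
    return -total / len(advantages)


def ippo_losses(batches, clip_eps=0.2):
    """Independent PPO: each agent's loss uses only that agent's own data.

    batches: dict mapping agent id -> (new_logp, old_logp, advantages).
    """
    return {
        agent_id: ppo_clip_loss(*batch, clip_eps=clip_eps)
        for agent_id, batch in batches.items()
    }


if __name__ == "__main__":
    demo = {
        "player_0": ([0.0], [0.0], [1.0]),
        "player_1": ([math.log(2.0)], [0.0], [1.0]),
    }
    print(ippo_losses(demo))
```

In this sketch no term couples one agent's loss to another's, which is the sense in which the learners are independent. Whether agents share network parameters is a separate design choice that the sketch does not address.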

Keywords: Multi-agent reinforcement learning (RL); distributed RL system; population-based training; reward shaping; game theory

Classification: TP391.41 [Automation and Computer Technology: Computer Application Technology]

 
