Survey of Fully Cooperative Multi-Agent Deep Reinforcement Learning (cited by: 6)


Authors: ZHAO Liyang; CHANG Tianqing; CHU Kaixuan; GUO Libin; ZHANG Lei (Department of Weaponry and Control, Army Academy of Armored Forces, Beijing 100072, China)

Affiliation: [1] Department of Weaponry and Control, Army Academy of Armored Forces, Beijing 100072, China

Source: Computer Engineering and Applications, 2023, No. 12, pp. 14-27 (14 pages)

Funding: National ministry-level pre-research project.

Abstract: As an important branch of machine learning and artificial intelligence, fully cooperative multi-agent deep reinforcement learning combines, in a general way, the representation and decision-making ability of deep reinforcement learning with the distributed cooperation ability of multi-agent systems, providing an end-to-end solution to model-free sequential decision-making problems in fully cooperative multi-agent systems. First, the basic principles of deep reinforcement learning are described, and the development of single-agent deep reinforcement learning is summarized along three main directions: value-function-based, policy-gradient-based, and actor-critic-based methods. Second, the main challenges and the main training frameworks of multi-agent deep reinforcement learning are analyzed. Then, according to how the maximum joint team reward is achieved, fully cooperative multi-agent deep reinforcement learning is divided into four categories (independent learning, communication learning, collaborative learning, and reward function shaping), each of which is summarized and analyzed. Finally, from the perspective of solving practical problems, future development directions for fully cooperative multi-agent deep reinforcement learning algorithms are discussed.

Keywords: deep reinforcement learning; multi-agent; fully cooperative; artificial intelligence

Classification: TP24 [Automation and Computer Technology: Detection Technology and Automatic Devices]

 
