Authors: ZHAO Liyang (赵立阳); CHANG Tianqing (常天庆); CHU Kaixuan (褚凯轩); GUO Libin (郭理彬); ZHANG Lei (张雷) (Department of Weaponry and Control, Army Academy of Armored Forces, Beijing 100072, China)
Affiliation: [1] Department of Weaponry and Control, Army Academy of Armored Forces, Beijing 100072, China
Source: Computer Engineering and Applications (《计算机工程与应用》), 2023, No. 12, pp. 14-27 (14 pages)
Funding: Pre-research project of national ministries and commissions.
Abstract: As an important branch of machine learning and artificial intelligence, fully cooperative multi-agent deep reinforcement learning combines, in a general way, the representation and decision-making capabilities of deep reinforcement learning with the distributed cooperation capabilities of multi-agent systems, providing an end-to-end solution to model-free sequential decision-making problems in fully cooperative multi-agent systems. First, the basic principles of deep reinforcement learning are described, and the development of single-agent deep reinforcement learning is summarized along three main directions: value-function-based, policy-gradient-based, and actor-critic-based methods. Second, the main challenges and the main training frameworks of multi-agent deep reinforcement learning are analyzed. Then, according to the way in which the maximum joint team reward is achieved, fully cooperative multi-agent deep reinforcement learning is divided into four categories, each of which is summarized and analyzed: independent learning, communication learning, collaborative learning, and reward function shaping. Finally, from the perspective of solving practical problems, future development directions for fully cooperative multi-agent deep reinforcement learning algorithms are discussed.
Classification number: TP24 [Automation and Computer Technology / Detection Technology and Automatic Devices]