A survey of multi-agent reinforcement learning methods (多智能体强化学习方法综述)  Cited by: 3


Authors: CHEN Renlong (陈人龙), CHEN Jiali (陈嘉礼), LI Shanqi (李善琦), TAN Ying (谭营) [1,2,3,4]

Affiliations: [1] Key Laboratory of Machine Perception (MOE), Peking University, Beijing 100871, China; [2] School of Intelligence Science and Technology, Peking University, Beijing 100871, China; [3] Institute for Artificial Intelligence, Peking University, Beijing 100871, China; [4] National Key Laboratory of General Artificial Intelligence, Peking University, Beijing 100871, China

Source: Information Countermeasures Technology (《信息对抗技术》), 2024, No. 1, pp. 18-32 (15 pages)

Funding: National Key Research and Development Program of China (2018AAA0102301); National Natural Science Foundation of China (62250037, 62276008, 62076010).

Abstract: Multi-agent reinforcement learning has shown strong potential for sequential decision-making problems in real-world scenarios such as autonomous driving and team-based cooperative games. However, it faces challenges including the curse of dimensionality, instability, multi-objectivity, and partial observability. This article gives an overview of the concepts and methods of multi-agent reinforcement learning and summarizes the main trends and research directions in current work. The research trends comprise the CTDE (centralized training with decentralized execution) paradigm, agents equipped with recurrent neural units, and training techniques. The primary research directions cover hybrid learning methods, cooperative and competitive learning, communication and knowledge sharing, adaptability and robustness, hierarchical and modular learning, game-theoretic approaches, and interpretability. Future research directions include addressing the curse of dimensionality, solving large-scale combinatorial optimization problems, and analyzing the global convergence of multi-agent reinforcement learning algorithms. Progress in these directions is expected to drive further breakthroughs in the practical application of multi-agent reinforcement learning.

Keywords: multi-agent reinforcement learning; reinforcement learning; multi-agent systems; group cooperation; curse of dimensionality

Classification code: TN915 [Electronics and Telecommunications: Communication and Information Systems]

 
