Authors: CUI Haoyan (崔浩岩); ZHANG Zhen (张震); ZHAO Dejing (赵德京); LIAO Dengyu (廖登宇) (School of Automation, Qingdao University, Qingdao 266071, China; Shandong Key Laboratory of Industrial Control Technology, Qingdao 266071, China)
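Affiliations: [1] School of Automation, Qingdao University, Qingdao 266071, Shandong, China [2] Shandong Key Laboratory of Industrial Control Technology, Qingdao 266071, Shandong, China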
Source: Control Engineering of China (《控制工程》), 2024, No. 7, pp. 1169-1177 (9 pages)
Funding: National Natural Science Foundation of China (61903209); Qingdao Postdoctoral Applied Research Project.
Abstract: To address the limited communication ability of agents in multi-agent systems and the curse of dimensionality of the joint action space in multi-agent reinforcement learning, a multi-agent Q-learning based on consensus (MAQC) algorithm is proposed. The algorithm uses a centralized training and decentralized execution framework. In the centralized training stage, MAQC uses value decomposition to alleviate the curse of dimensionality of the joint action space. In addition, each agent sends its own perceived local state, together with the local states it has received from its neighbors, to all of its neighbors, so that every agent in the network eventually obtains the global state of all agents. The temporal-difference information each agent requires is obtained by a consensus algorithm, and each agent needs to send only its component of the temporal-difference information to its neighbors. In the execution stage, each agent selects its action using only the Q-value function that depends on its own action. Experimental results show that the MAQC algorithm can converge to the optimal joint policy.
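The two communication steps described in the abstract (relaying local states to neighbors, and obtaining temporal-difference information by consensus) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not give the paper's update rules, so the function names `flood_states` and `average_consensus`, the four-agent ring topology, the step size `eps`, and the sample values are hypothetical. The consensus step shown is the standard discrete-time average consensus iteration, not necessarily the variant used in MAQC.

```python
import numpy as np

def flood_states(neighbors, local_states):
    """Relay known local states to neighbors until every agent knows all of them.

    neighbors[i] lists the agents that agent i can talk to (undirected graph).
    Assumes the graph is connected, so n - 1 rounds are enough.
    """
    n = len(neighbors)
    known = [{i: local_states[i]} for i in range(n)]
    for _ in range(n - 1):
        snapshot = [dict(k) for k in known]  # messages sent this round
        for i in range(n):
            for j in neighbors[i]:
                known[j].update(snapshot[i])
    return known

def average_consensus(neighbors, values, steps=200, eps=0.3):
    """Standard average consensus: x_i <- x_i + eps * sum_j (x_j - x_i).

    Converges to the mean of the initial values on a connected undirected
    graph when eps is smaller than 1 / (maximum node degree).
    """
    x = np.asarray(values, dtype=float).copy()
    for _ in range(steps):
        new_x = x.copy()
        for i, nbrs in enumerate(neighbors):
            new_x[i] += eps * sum(x[j] - x[i] for j in nbrs)
        x = new_x
    return x

# Hypothetical 4-agent ring: each agent talks only to its two neighbors.
ring = [[1, 3], [0, 2], [1, 3], [0, 2]]

global_views = flood_states(ring, ["s0", "s1", "s2", "s3"])
td_components = [1.0, 2.0, 3.0, 6.0]  # made-up per-agent TD components
estimates = average_consensus(ring, td_components)
print(global_views[0])
print(estimates)
```

In this sketch each agent exchanges only scalars with its direct neighbors, yet every agent ends up with the same network-wide average. If a sum is needed instead of a mean, multiplying the agreed value by the number of agents recovers it.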
Keywords: multi-agent reinforcement learning; agent communication; consensus; Q-learning; value decomposition
Classification code: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]