Multi-agent collaboration method based on improved deep reinforcement learning

Cited by: 4

Authors: SUN Yingbo; MIAO Guoying[1]; ZHUANG Ya’nan (School of Automation, Nanjing University of Information Science & Technology, Nanjing 210044, China)

Affiliation: [1] School of Automation, Nanjing University of Information Science & Technology, Nanjing 210044, Jiangsu, China

Source: Transducer and Microsystem Technologies (《传感器与微系统》), 2023, No. 9, pp. 25-29 (5 pages)

Funding: National Natural Science Foundation of China (62073169); Jiangsu Province "333 Project" (BRA2020067).

Abstract: In multi-agent deep reinforcement learning, value-function fitting does not fully account for the interactions between agents, and actions are chosen at random with high probability. This wastes data during iterative trial and error and leads to low collaboration efficiency and slow convergence. To address these problems, an average weight mechanism for collaboration and an improved exploration strategy are proposed. First, an averaged deep Q-network (DQN) is used to design a weight structure in the multi-agent value-function policy network, which reduces the adverse influence among agents. Second, the exploration strategy is improved by using Euclidean distance, which raises the agents' exploration efficiency and the cooperativeness of their policies and improves the system's ability to escape local minima. Experimental results in multiple scenarios show that the proposed method improves the learning ability and learning efficiency of the multi-agent system.
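The abstract names two ingredients, an averaged DQN value estimate and a Euclidean-distance-based exploration rule, without giving formulas. The sketch below is only one plausible reading, not the authors' implementation. `averaged_q_target` follows the generic averaged-DQN idea of averaging Q-values from several past network snapshots before bootstrapping. `distance_guided_action` is a hypothetical epsilon-greedy variant that, when exploring, favours actions that shorten the Euclidean distance to a goal. All function names, parameters (`gamma`, `epsilon`, `temperature`, `action_moves`), and the goal-position setup are assumptions made for illustration.

```python
import numpy as np


def averaged_q_target(q_next_snapshots, reward, gamma=0.95, done=False):
    """TD target using the mean of K snapshot estimates of Q(s', .).

    q_next_snapshots: array-like of shape (K, n_actions), one row per
    past Q-network snapshot (assumed setup, not taken from the paper).
    """
    q_mean = np.mean(np.asarray(q_next_snapshots, dtype=float), axis=0)
    bootstrap = 0.0 if done else gamma * float(np.max(q_mean))
    return reward + bootstrap


def distance_guided_action(q_values, agent_pos, goal_pos, action_moves,
                           epsilon, rng, temperature=1.0):
    """Epsilon-greedy choice whose exploratory branch is distance-biased.

    action_moves: array-like of shape (n_actions, dim) giving the
    displacement each action would cause (hypothetical environment model).
    """
    q_values = np.asarray(q_values, dtype=float)
    if rng.random() >= epsilon:
        # Exploit: pick the greedy action.
        return int(np.argmax(q_values))
    # Explore: weight actions by how close they bring the agent to the goal.
    next_pos = np.asarray(agent_pos, dtype=float) + np.asarray(action_moves, dtype=float)
    dist = np.linalg.norm(next_pos - np.asarray(goal_pos, dtype=float), axis=1)
    logits = -dist / temperature
    probs = np.exp(logits - logits.max())  # numerically stable softmax
    probs /= probs.sum()
    return int(rng.choice(len(q_values), p=probs))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    print(averaged_q_target([[1.0, 2.0], [3.0, 0.0]], reward=1.0, gamma=0.5))
    print(distance_guided_action([0.1, 0.9], (0.0, 0.0), (5.0, 0.0),
                                 [[1.0, 0.0], [-1.0, 0.0]], epsilon=0.3, rng=rng))
```

Averaging over snapshots reduces the variance of the bootstrapped target, and the softmax over negative distances keeps some randomness while steering exploration away from uniformly random moves. How the paper's weight structure couples the agents is not stated in the abstract and is not modelled here.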

Keywords: multi-agent; deep reinforcement learning; average weight; collaboration; strategy

Classification: TP242 [Automation and Computer Technology: Detection Technology and Automatic Devices]

 
