Traffic Signal Control Using Multi-Agent Deep Reinforcement Learning Combined with an Attention Mechanism


Author: Qingqing Xu (徐晴晴) (School of Optoelectronic Information and Computer Engineering, Shanghai University of Science and Technology, Shanghai)

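Affiliation: [1] School of Optoelectronic Information and Computer Engineering, Shanghai University of Science and Technology, Shanghai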

Source: Operations Research and Fuzziology, 2024, No. 2, pp. 373-387 (15 pages)

Abstract: Intelligent traffic signal control methods are increasingly being applied in the real world and have achieved good results. Among them, multi-agent deep reinforcement learning is a very effective approach. However, in multi-intersection traffic signal control, large-scale traffic networks easily lead to a severe curse of dimensionality, and feature extraction from the road environment also remains inadequate. To address these issues, a new multi-agent deep reinforcement learning algorithm is proposed. The algorithm is based on the Double Dueling Deep Q-Network (3DQN), which eliminates the Q-value overestimation problem of traditional reinforcement learning algorithms. Mean Field (MF) theory is introduced to greatly reduce the dimensions of the state and action spaces, and an attention mechanism is integrated to observe the road environment comprehensively, so that each agent obtains more accurate environmental information. A traffic network was modeled in Simulation of Urban Mobility (SUMO) to simulate real-world traffic flow and evaluate the algorithm. The experimental results show that the proposed algorithm increases the reward by 64.17%, 36.40%, and 32.55% compared with DQN, DDPG, and MA2C, respectively, demonstrating the correctness and superiority of the proposed algorithm.
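The abstract names three building blocks (dueling Q-value aggregation, the double-DQN target, and a mean-field summary of neighboring agents) without giving their formulas. The following is a minimal sketch of the standard textbook forms of these ideas, not the paper's implementation; all function and parameter names are hypothetical, and the attention mechanism and network training are omitted.

```python
from typing import List, Sequence


def dueling_q_values(state_value: float, advantages: Sequence[float]) -> List[float]:
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a')."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + adv - mean_adv for adv in advantages]


def double_dqn_target(
    reward: float,
    gamma: float,
    next_q_online: Sequence[float],
    next_q_target: Sequence[float],
    done: bool,
) -> float:
    """Double-DQN target: the online network picks the next action,
    the target network evaluates it, which reduces overestimation."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


def mean_neighbor_action(neighbor_actions: Sequence[int], num_actions: int) -> List[float]:
    """Mean-field summary: average of the neighbors' one-hot actions, so an agent
    conditions on one fixed-size vector instead of the joint action of all neighbors."""
    counts = [0.0] * num_actions
    for action in neighbor_actions:
        counts[action] += 1.0
    return [c / len(neighbor_actions) for c in counts]
```

In this sketch, an intersection agent with four neighbors and two signal phases would condition its Q-function on a two-element mean-action vector rather than on all 2^4 joint neighbor actions, which is the kind of dimensionality reduction the abstract attributes to mean field theory.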

Keywords: multi-agent deep reinforcement learning; intelligent traffic signal control; mean field theory; machine learning

Classification No.: U49 [Transportation Engineering: Transportation Planning and Management]

 
