Road network traffic control reinforcement learning algorithms based on Nash-Stackelberg hierarchical game model (Cited by: 2)

Authors: Zhang Zundong [1,2]; Wang Yannan; Liu Yuke; Liu Xiaoming; Shang Chunlin [1,2]

Affiliations: [1] Beijing Key Laboratory of Urban Intelligent Traffic Control Technology, North China University of Technology, Beijing 100144, China; [2] College of Transportation, Ludong University, Yantai 264205, China

Source: Journal of Southeast University (Natural Science Edition), 2023, No. 2, pp. 334-341 (8 pages)

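Funding: National Key Research and Development Program of China (2018YFB1601000); Open Project Fund of the State Key Laboratory of Rail Traffic Control and Safety (Beijing Jiaotong University) (RCS2022K007).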

Abstract: To reduce the computational complexity of finding Nash equilibria in multi-intersection games, a Nash-Stackelberg hierarchical game (NSHG) model is constructed. The model accounts for the differing importance of intersections in the road network and the game relationships among them, and it covers traffic control strategies both between and within sub-areas: the important intersections in two sub-areas act as the upper-layer players, and the secondary intersections act as the lower-layer players. Two multi-agent reinforcement learning (MARL) algorithms are then proposed: NSHG-based Q-learning (NSHG-QL) and NSHG-based deep Q-network (NSHG-DQN). In the experiments, both algorithms control traffic signals in a road network built with the SUMO simulation software and are compared with a basic game-model solution algorithm. The results show that NSHG-QL and NSHG-DQN reduce the average travel time and average time loss of vehicles at the intersections and increase the average speed. The NSHG model coordinates the secondary intersections to select optimal strategies while satisfying the upper-layer game between the important intersections, and the MARL algorithms based on the hierarchical game model significantly improve learning performance and convergence.

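Keywords: computational complexity; traffic control strategy; hierarchical game model; multi-agent reinforcement learning; optimal strategy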

Classification code: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]
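
The abstract describes the layered structure only in words, so the following is a minimal sketch of how one decision step of a Nash-Stackelberg hierarchy could look in tabular form. It is not the authors' NSHG-QL algorithm. Everything here is an assumption for illustration: two leader intersections and a single follower, Q-values stored as nested lists indexed `[a1][a2][b]`, pure-strategy equilibria only, a tie-break that maximizes the leaders' summed payoff, and the hypothetical function names `best_response`, `induced_payoffs`, `pure_nash`, `nshg_stage_action`, and `q_update`.

```python
from itertools import product


def best_response(q_follower, a1, a2):
    """Lower layer: the follower picks its best action given the leaders' joint action."""
    row = q_follower[a1][a2]
    return max(range(len(row)), key=row.__getitem__)


def induced_payoffs(q_leader, q_follower):
    """Leader payoff matrix once the follower's best response is anticipated."""
    n1, n2 = len(q_follower), len(q_follower[0])
    return [
        [q_leader[a1][a2][best_response(q_follower, a1, a2)] for a2 in range(n2)]
        for a1 in range(n1)
    ]


def pure_nash(u1, u2):
    """Upper layer: all pure-strategy Nash equilibria of the two-leader matrix game."""
    n1, n2 = len(u1), len(u1[0])
    equilibria = []
    for a1, a2 in product(range(n1), range(n2)):
        leader1_ok = all(u1[a1][a2] >= u1[x][a2] for x in range(n1))
        leader2_ok = all(u2[a1][a2] >= u2[a1][y] for y in range(n2))
        if leader1_ok and leader2_ok:
            equilibria.append((a1, a2))
    return equilibria


def nshg_stage_action(q1, q2, q_follower):
    """Return (a1, a2, b): leaders play Nash on induced payoffs, follower responds."""
    u1 = induced_payoffs(q1, q_follower)
    u2 = induced_payoffs(q2, q_follower)
    # Assumption: if no pure equilibrium exists, fall back to all joint actions.
    candidates = pure_nash(u1, u2) or list(product(range(len(u1)), range(len(u1[0]))))
    # Assumption: break ties by the leaders' summed payoff.
    a1, a2 = max(candidates, key=lambda e: u1[e[0]][e[1]] + u2[e[0]][e[1]])
    return a1, a2, best_response(q_follower, a1, a2)


def q_update(q_value, reward, next_value, alpha=0.1, gamma=0.9):
    """Standard temporal-difference update; next_value would be the value of the
    equilibrium joint action in the next state."""
    return q_value + alpha * (reward + gamma * next_value - q_value)
```

Restricting the upper-layer equilibrium search to the important intersections is what keeps the joint-action enumeration small in this sketch; the secondary intersection never enters the Nash computation and only responds to its outcome.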
