Cooperative control algorithm of multi-intersection variable-direction lanes based on reinforcement learning  Cited by: 2

Authors: XU Xiao-gao, XIA Ying-jie [1], ZHU Si-yu, KUANG Li [2]

Affiliations: [1] College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, Zhejiang, China; [2] School of Computer Science and Engineering, Central South University, Changsha 410012, Hunan, China

Source: Journal of Zhejiang University: Engineering Science, 2022, No. 5, pp. 987-994, 1005 (9 pages)

Funding: National Natural Science Foundation of China (61873232).

Abstract: Traditional variable-direction lane control methods cannot adapt to the complex traffic flows that arise across multiple intersections. To address this and relieve multi-intersection congestion, a cooperative control method for multi-intersection variable-direction lanes based on multi-agent reinforcement learning is proposed. The method improves the QMIX multi-agent reinforcement learning algorithm. To handle global reward allocation in the variable-direction lane scenario, the global reward is decomposed into a basic reward and a performance reward, which improves the accuracy of lane-direction switching decisions under congestion. Prioritized experience replay is introduced to make more efficient use of the transition sequences in the replay buffer and to accelerate convergence. Experimental results show that the proposed method outperforms other control methods in queue length, delay time, and waiting time, and that it effectively coordinates the policy switching of variable-direction lanes and improves the traffic capacity of the road network across multiple intersections.
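The abstract names prioritized experience replay but gives no implementation details. The sketch below shows the generic proportional variant of that technique, not the authors' code: the class name `PrioritizedReplayBuffer`, the hyperparameters `alpha`, `beta`, and `eps`, and the list-based O(n) sampling are all assumptions chosen for brevity.

```python
import random


class PrioritizedReplayBuffer:
    """Proportional prioritized replay; a simple list-based sketch (hypothetical)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha  # how strongly priorities skew sampling (0 = uniform)
        self.eps = eps      # keeps every priority strictly positive
        self.data = []
        self.priorities = []
        self.pos = 0        # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New items get the current maximum priority so they are likely to be replayed soon.
        p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4, rng=random):
        # Sampling probability is proportional to priority ** alpha.
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        idxs = rng.choices(range(len(self.data)), weights=probs, k=batch_size)
        # Importance-sampling weights offset the bias of non-uniform sampling;
        # here they are normalized by the largest weight in the batch.
        n = len(self.data)
        weights = [(n * probs[i]) ** (-beta) for i in idxs]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return idxs, [self.data[i] for i in idxs], weights

    def update_priorities(self, idxs, td_errors):
        # After a learning step, priority follows the magnitude of the TD error.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

Stored items are opaque to the buffer, so in a QMIX-style setup each one could be a whole transition sequence rather than a single step. How the paper assigns a priority to a sequence is not stated in the abstract.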

Keywords: variable-direction lane; reinforcement learning; multi-agent; adaptive control; intelligent transportation

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]

 
