Traffic Signal Decentralized Reinforcement Learning Method Based on a Multi-perspective Spatio-temporal Hierarchical Structure

Cited by: 1


Authors: WANG Fu-jian[1]; FAN Cheng-rui; ZHOU Bin; FENG Chun-fang; MA Dong-fang[4]

Affiliations: [1] College of Civil Engineering and Architecture, Zhejiang University, Hangzhou 310058, Zhejiang, China; [2] Polytechnic Institute, Zhejiang University, Hangzhou 310015, Zhejiang, China; [3] Traffic Management Research Institute, Ministry of Public Security, Wuxi 214151, Jiangsu, China; [4] Ocean College, Zhejiang University, Zhoushan 316021, Zhejiang, China

Source: China Journal of Highway and Transport (中国公路学报), 2024, No. 7, pp. 250-263 (14 pages)

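Funding: National Natural Science Foundation of China (52172334); Open Project of the Zhejiang Provincial Intelligent Transportation Engineering Technology Research Center (2023ERCITZJ-KF09); Scientific Research Project of the Zhejiang Provincial Department of Education (Y202353473).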

Abstract: Signal control is an important component of intelligent transportation systems, and signal optimization that incorporates artificial intelligence and other new technologies has become a research focus. Signal control strategies fall into two categories: centralized and decentralized. Decentralized control uses lightweight state spaces that effectively avoid the curse of dimensionality in deep reinforcement learning, and it has received increasing attention in recent years. Existing decentralized coordinated control strategies mostly rely on graph convolutional networks or graph attention networks to learn the coupling relationships between intersections. However, they give insufficient attention to how the spatio-temporal correlations between intersection states change dynamically with time-varying traffic flows. Accordingly, this study first constructs a time-varying traffic flow feature extraction method based on gated recurrent neural networks and determines the spatio-temporal correlation of multiple intersections. Second, a hierarchical fusion algorithm for regional spatio-temporal features is built using a graph attention mechanism, which achieves a deep integration of traffic flow characteristics through a multi-perspective spatio-temporal hierarchical structure, and the state space is reconstructed using intersection importance as the index. Third, an intersection right-of-way switching decision model for adaptive phase-sequence structures is developed based on a fully connected concept. Finally, the model is tested in simulation on a real-world road network. The results show that, compared with traditional decentralized reinforcement learning algorithms, the developed model reduces the average vehicle queue length by at least 13.74%, 5.03%, and 6.30% under low, medium, and high traffic flows, respectively, indicating the potential application value of the proposed method.
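The idea of reconstructing an intersection's state space from the importance of its neighbors can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each intersection already has a fixed-length temporal feature vector (for example, the output of a recurrent encoder), and it swaps in plain scaled dot-product attention for the learned graph attention layer. The function names `attention_weights` and `reconstruct_state`, the neighbor labels, and the parameter `k` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(target, neighbors):
    """Scaled dot-product attention of one intersection over its neighbors.

    target: temporal feature vector of the controlled intersection
            (assumed to come from a recurrent encoder, not computed here).
    neighbors: dict mapping neighbor id -> feature vector of the same length.
    Returns a dict mapping neighbor id -> importance weight (weights sum to 1).
    """
    scale = math.sqrt(len(target))
    ids = list(neighbors)
    scores = [
        sum(t * n for t, n in zip(target, neighbors[i])) / scale for i in ids
    ]
    return dict(zip(ids, softmax(scores)))


def reconstruct_state(target, neighbors, k):
    """Build a compact state: own features plus the k most important neighbors.

    Each kept neighbor's features are scaled by its attention weight, so the
    state length is len(target) * (k + 1) regardless of how many neighbors exist.
    """
    weights = attention_weights(target, neighbors)
    top = sorted(weights, key=weights.get, reverse=True)[:k]
    state = list(target)
    for i in top:
        state.extend(weights[i] * x for x in neighbors[i])
    return top, state


# Illustrative (made-up) feature vectors for one intersection and three neighbors.
target = [1.0, 0.0]
neighbors = {"north": [1.0, 0.0], "east": [0.0, 1.0], "south": [-1.0, 0.0]}
top, state = reconstruct_state(target, neighbors, k=2)
```

In this toy setup the neighbor whose features align most closely with the target receives the largest weight, and the reconstructed state keeps a fixed size, which is the property that keeps a decentralized agent's input lightweight.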

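Keywords: traffic engineering; intelligent transportation; deep reinforcement learning; signal control; multi-perspective spatio-temporal learning; hierarchical learning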

Classification number: U491.5 [Transportation Engineering: Transportation Planning and Management]

 
