Multi-metric optimized deep reinforcement learning for single-intersection signal control (多指标优化的深度强化学习单交叉口信号控制)  Cited by: 4

Authors: Ren Anhu (任安虎)[1], Ren Yangyang (任洋洋), Wang Yao (王瑶) (School of Electronic and Information Engineering, Xi'an Technological University, Xi'an 710021, China)

Affiliation: [1] School of Electronic and Information Engineering, Xi'an Technological University, Xi'an 710021

Source: Foreign Electronic Measurement Technology (《国外电子测量技术》), 2022, No. 10, pp. 104-111 (8 pages)

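Funding: Supported by a project of the Science and Technology Department of Shaanxi Province (2018GY-153) and a project of the Science and Technology Bureau of Weiyang District, Xi'an, Shaanxi Province (201833).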

Abstract: Existing intersection signal control methods cannot respond effectively to traffic conditions that change in real time. To address this, the paper proposes a deep reinforcement learning method with multi-metric optimization for signal control at a single intersection. The reward function is defined by jointly optimizing multiple metrics. Actions are selected with a greedy policy whose exploration rate undergoes cosine decay at a fixed frequency, which allows sufficient exploration of unknown actions while still yielding better convergence. The control performance of the algorithm is verified on the SUMO simulation platform. The results show that, compared with a fixed-time scheme and an actuated control scheme, the algorithm more effectively reduces three metrics at the intersection: vehicle delay time, queue length, and number of stops. It therefore offers better applicability and effectiveness.
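
The two mechanisms named in the abstract, a multi-metric reward and a greedy policy with a cosine-decayed exploration rate, can be sketched in Python as follows. The abstract gives neither the exact decay schedule nor the reward formula, so the half-cosine schedule, the `update_every` interval, the weighted negative sum, and all function and parameter names here are assumptions for illustration, not the authors' implementation.

```python
import math
import random
from typing import Sequence, Tuple


def cosine_epsilon(episode: int, total_episodes: int, update_every: int = 1,
                   eps_max: float = 1.0, eps_min: float = 0.01) -> float:
    """Exploration rate that follows a half-cosine from eps_max down to eps_min.

    The value is refreshed only once every `update_every` episodes
    (an assumed reading of "decay at a fixed frequency").
    """
    clamped = min(max(episode, 0), total_episodes)
    held = (clamped // update_every) * update_every  # last refresh point
    progress = held / total_episodes
    return eps_min + 0.5 * (eps_max - eps_min) * (1.0 + math.cos(math.pi * progress))


def select_action(q_values: Sequence[float], epsilon: float,
                  rng: random.Random) -> int:
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def multi_metric_reward(delay_s: float, queue_len: float, stops: float,
                        weights: Tuple[float, float, float] = (1.0, 1.0, 1.0)) -> float:
    """Assumed reward: negative weighted sum of delay, queue length, and stops."""
    w_delay, w_queue, w_stops = weights
    return -(w_delay * delay_s + w_queue * queue_len + w_stops * stops)


if __name__ == "__main__":
    rng = random.Random(0)
    q = [0.2, 1.5, -0.3, 0.9]  # hypothetical Q-values for four signal phases
    for ep in (0, 50, 100):
        eps = cosine_epsilon(ep, total_episodes=100)
        print(ep, round(eps, 3), select_action(q, eps, rng))
    print(multi_metric_reward(delay_s=10.0, queue_len=4, stops=2))
```

In a SUMO experiment, the three inputs to `multi_metric_reward` would come from simulation measurements at each decision step. The weights would need tuning so that no single metric dominates the reward.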

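Keywords: traffic signal control; convolutional neural network; deep reinforcement learning; multi-metric optimization; DQN algorithm; SUMO simulation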

Classification number: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]

 
