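Authors: Guo Mengjie (郭梦杰); Ren Anhu (任安虎) [1]
Affiliation: [1] Electronic Information Engineering College, Xi'an Technological University, Xi'an 710000, China
Source: Electronic Measurement Technology (《电子测量技术》), 2019, No. 24, pp. 49-52 (4 pages)
Funding: Shaanxi Provincial Department of Science and Technology project (2018GY-153); Xi'an Weiyang District Science and Technology Bureau project (201833).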
Abstract: To address traffic congestion, this work combines deep reinforcement learning with traffic signal control. A road model of a single intersection is constructed, and signal control is cast as a reinforcement learning problem in which an agent interacts with the intersection over discrete time steps, with the waiting time at the intersection as the objective function. Drawing on the decision-making ability of reinforcement learning and the perception ability of deep learning, the agent observes the environment state, selects and executes the control strategy it estimates to be optimal for the current state, and updates the state at the next time step according to the reward function. Simulation experiments are carried out in the simulation software SUMO. Compared with fixed-time control, the proposed method improves vehicle waiting time to varying degrees under traffic flows of different saturation levels, which verifies the effectiveness of the algorithm.
Keywords: deep learning; traffic signal control; deep Q-network; SUMO
CLC number: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]
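
The formulation in the abstract (an agent that observes the intersection at discrete time steps, picks a signal phase, and is rewarded according to waiting time) can be illustrated with a minimal sketch. This is not the authors' implementation: the paper uses a deep Q-network in SUMO, whereas the sketch below assumes a toy two-approach queue model and swaps in tabular Q-learning so that it runs with the standard library alone. The names `ToyIntersection` and `train`, the arrival probabilities, and the reward definition (negative number of queued vehicles per step, a stand-in for waiting time) are all hypothetical.

```python
import random
from collections import defaultdict


class ToyIntersection:
    """Hypothetical single intersection with two approaches (NS and EW).

    A stand-in for the paper's SUMO model, not a reproduction of it.
    """

    def __init__(self, arrival_prob=(0.3, 0.3), max_queue=10, seed=0):
        self.arrival_prob = arrival_prob
        self.max_queue = max_queue
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        self.queues = [0, 0]  # vehicles waiting on NS and EW
        self.phase = 0        # 0: NS green, 1: EW green
        return self.state()

    def state(self):
        return (self.queues[0], self.queues[1], self.phase)

    def step(self, action):
        """Show phase `action` for one time step; return (state, reward)."""
        self.phase = action
        # One vehicle is served on the approach that has green.
        if self.queues[action] > 0:
            self.queues[action] -= 1
        # Random arrivals on each approach, capped at max_queue.
        for i, p in enumerate(self.arrival_prob):
            if self.rng.random() < p and self.queues[i] < self.max_queue:
                self.queues[i] += 1
        # Assumed reward: each queued vehicle waits one more step.
        reward = -sum(self.queues)
        return self.state(), reward


def train(env, steps=20000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=1):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = defaultdict(lambda: [0.0, 0.0])  # state -> value of each phase
    state = env.reset()
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(2)
        else:
            action = max((0, 1), key=lambda a: q[state][a])
        next_state, reward = env.step(action)
        # Standard Q-learning target: r + gamma * max_a' Q(s', a').
        target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    return q


if __name__ == "__main__":
    q_table = train(ToyIntersection())
    print("states visited:", len(q_table))
```

A deep Q-network replaces the table `q` with a neural network that maps the observed state to one value per action, which is what lets the approach scale to richer state descriptions than this toy queue pair.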