A Resistance Strategy for Bus Service Disruption in Depot Based on Reinforcement Learning

Original title (Chinese): 基于强化学习的公交站场服务中断防治策略

Authors: LUN Jia-ming (伦嘉铭); JIANG Hai-ming (姜海明); XIE Kang (谢康) (School of Electromechanical Engineering, Guangdong University of Technology, Guangzhou, Guangdong 510006, China)

Affiliation: [1] School of Electromechanical Engineering, Guangdong University of Technology, Guangzhou, Guangdong 510006, China

Source: Computer Simulation (《计算机仿真》), 2024, No. 4, pp. 129-135, 425 (8 pages in total)

Funding: National Natural Science Foundation of China (11874126); Guangdong Province "Leading Talent" Program (400180001).

Abstract: To alleviate bus service disruption at depots, this paper proposes a dynamic departure control strategy based on reinforcement learning. The strategy uses a long short-term memory (LSTM) model to predict bus travel times, so that the agent can perceive the headway state of both the vehicles waiting in the depot and the vehicles in service, and can therefore better evaluate the long-term impact of its decisions. For the scenario in which the depot has no bus available to dispatch, a state-dependent differentiable function masks invalid actions when the action probability distribution is computed, which prevents the agent from issuing invalid commands. The reward function penalizes large departure intervals, and the model is trained with proximal policy optimization (PPO). Simulation results show that, compared with traditional methods, the proposed method not only effectively avoids service disruption at the depot but also yields more balanced vehicle load factors, shorter passenger waiting times, and higher vehicle utilization.
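The invalid-action masking step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact masking function, so the code below assumes one common form: logits of invalid actions are replaced by negative infinity before the softmax, which drives their probability to zero. The two-action set (hold, dispatch) and the names `masked_action_probs` and `dispatch_mask` are hypothetical and chosen only for illustration.

```python
import math


def masked_action_probs(logits, valid_mask):
    """Softmax over logits, with invalid actions forced to probability 0.

    Assumption: masking is done by setting invalid logits to -inf,
    which is one common choice, not necessarily the paper's function.
    """
    if len(logits) != len(valid_mask):
        raise ValueError("logits and valid_mask must have the same length")
    if not any(valid_mask):
        raise ValueError("at least one action must be valid")
    # Replace the logit of every invalid action with -inf.
    masked = [x if ok else -math.inf for x, ok in zip(logits, valid_mask)]
    # Subtract the maximum for numerical stability before exponentiating.
    top = max(masked)
    exps = [math.exp(x - top) for x in masked]
    total = sum(exps)
    return [e / total for e in exps]


def dispatch_mask(buses_in_depot):
    """Hypothetical action set: index 0 = hold, index 1 = dispatch a bus.

    Dispatching is valid only when the depot has at least one bus.
    """
    return [True, buses_in_depot > 0]


# Empty depot: the dispatch action is masked out entirely.
empty_depot_probs = masked_action_probs([0.2, 1.5], dispatch_mask(0))
# Two buses waiting: both actions keep a nonzero probability.
stocked_depot_probs = masked_action_probs([0.2, 1.5], dispatch_mask(2))
```

With this form of masking, the valid actions keep their relative preferences from the policy network while the masked actions can never be sampled, so the agent cannot order a departure from an empty depot.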

Keywords: bus service disruption; real-time control; reinforcement learning; proximal policy optimization; invalid action masking

Classification: TP391.9 [Automation and Computer Technology: Computer Application Technology]
