An Automatic High-Speed Train Parking Method Based on Deep Reinforcement Learning

Authors: ZHANG Yunxia [1], LIANG Dongyue, YANG Weihua

Affiliations: [1] Public Class Teaching Department, Shanxi Financial and Taxation College, Taiyuan 030024, Shanxi, China; [2] School of Mathematics, Taiyuan University of Technology, Jinzhong 030600, Shanxi, China

Source: Journal of Taiyuan Normal University: Natural Science Edition, 2022, No. 4, pp. 35-43 (9 pages)

Abstract: To address the change in control-model parameters when the braking gear is switched during automatic stopping of high-speed trains, and to improve the performance of artificial-intelligence control algorithms for automatic train parking (ATP), an ATP braking control algorithm based on deep reinforcement learning is proposed. A single-point model is used to describe and analyze the motion characteristics of the train during the parking process. To make the reinforcement learning environment more consistent with real operation, in which different trains of the same model and trains of different models stop on multiple entry tracks, a multi-vehicle, multi-line algorithm environment is built from real line-slope data and data from trains in service. A multi-input, single-output neural network structure combining a long short-term memory (LSTM) network with a fully connected network is proposed to compute the control parameters for platform parking. By jointly exploiting the memory ability of the LSTM network and the generalization ability of the fully connected network, the difficulty of optimizing the control parameters when the brake gear switches is resolved. In simulation experiments under strong disturbance, the algorithm is compared with an existing ATP method based on knowledge and a double deep Q-network (KDDQN). The results show that the proposed algorithm switches brake gears less frequently and produces smaller changes in acceleration during braking, giving higher passenger comfort, and that its absolute parking error stays within the required 0.300 m.
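The single-point model and the 0.300 m parking-error requirement mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' environment or controller: the constant brake deceleration, the gravity-only slope term, the time step, and every name (`TrainState`, `step`, `simulate_stop`, `within_tolerance`) are assumptions chosen for illustration, and the learned LSTM/fully connected policy is replaced here by a fixed deceleration.

```python
from dataclasses import dataclass

G = 9.81  # gravitational acceleration in m/s^2


@dataclass
class TrainState:
    position: float  # distance travelled along the track, in m
    speed: float     # speed, in m/s (never negative)


def step(state: TrainState, brake_decel: float, slope: float = 0.0,
         dt: float = 0.1) -> TrainState:
    """Advance a point-mass train by one time step.

    brake_decel: deceleration of the current brake gear in m/s^2 (assumed constant)
    slope: track gradient as rise/run, positive uphill (assumed small)
    """
    if state.speed <= 0.0:
        return TrainState(state.position, 0.0)  # already stopped
    accel = -brake_decel - G * slope
    new_speed = state.speed + accel * dt
    if new_speed <= 0.0:
        # The train stops inside this step: integrate only up to the stop time.
        t_stop = state.speed / -accel
        return TrainState(state.position + 0.5 * state.speed * t_stop, 0.0)
    # Constant acceleration over the step, so the trapezoidal update is exact.
    return TrainState(state.position + 0.5 * (state.speed + new_speed) * dt,
                      new_speed)


def simulate_stop(initial_speed: float, brake_decel: float, slope: float = 0.0,
                  dt: float = 0.1, max_steps: int = 100_000) -> float:
    """Return the position at which the train comes to rest."""
    state = TrainState(0.0, initial_speed)
    for _ in range(max_steps):
        if state.speed <= 0.0:
            return state.position
        state = step(state, brake_decel, slope, dt)
    raise RuntimeError("train did not stop within max_steps")


def within_tolerance(stop_position: float, target_position: float,
                     tolerance: float = 0.300) -> bool:
    """Check the absolute parking error against the 0.300 m requirement."""
    return abs(stop_position - target_position) <= tolerance
```

With these assumed values, a train entering at 10 m/s under a constant 0.5 m/s² deceleration on level track stops after about 100 m (v²/2a), so `within_tolerance(simulate_stop(10.0, 0.5), 100.0)` holds, while a target 1 m further along would fail the check.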

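Keywords: automatic train operation; automatic train parking; machine learning; deep reinforcement learning; multi-vehicle multi-line; long short-term memory network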

Classification number: U284.48 [Transportation Engineering: Traffic Information Engineering and Control]

 
