High-efficiency Freight Train Rescheduling Enabled by Multi-agent Reinforcement Learning  Cited by: 3


Authors: JIANG Lingming; NI Shaoquan [1]

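Affiliations: [1] School of Transportation and Logistics, Southwest Jiaotong University, Chengdu 611756, Sichuan, China; [2] China Railway Signal and Communication Research & Design Institute Group Co., Ltd., Beijing 100070, China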

Source: Journal of the China Railway Society, 2023, No. 8, pp. 27-35 (9 pages)

Funding: National Key Research and Development Program of China (2017YFB1200700).

Abstract: To efficiently reduce the impact of freight train delays, a highly efficient multi-agent reinforcement learning (MARL) approach is proposed for the dynamic rescheduling of large-scale train timetables under delay disturbances. Unlike the single-agent reinforcement learning (SARL) algorithm, the proposed MARL algorithm assigns the up-direction trains and the down-direction trains to two separate agents, which learn independently and compete with each other. This greatly reduces the complexity of the rescheduling algorithm: for N freight trains, the complexity falls from O(2^N) to O(2^(1+N/2)), so computing efficiency improves by a factor of 2^(N/2-1). The algorithm was tested on the Wannanchang-Dongsheng section of the Baotou-Shenmu freight railway (North China), a scenario with 22 freight trains and 9 stations, and the rescheduling results were recorded under different delay disturbances. In these cases the MARL algorithm reduced the computing time to less than 1% of that of the SARL algorithm. For the large-scale rescheduling task involving all 22 trains, the MARL algorithm still obtained optimized results quickly on a general-purpose computer, whereas the SARL algorithm could not complete the task because its computing time was excessive (over 3,000 days). The proposed MARL algorithm therefore has theoretical significance and practical value for efficient, real-time rescheduling of freight trains in large-scale scenarios.
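The complexity figures quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' code: it assumes the N trains split evenly between the two direction agents, which is one reading of the 1 + N/2 exponent, and the function names are hypothetical.

```python
def sarl_size(n_trains: int) -> int:
    """Search-space size quoted for a single agent handling all N trains: 2**N."""
    return 2 ** n_trains


def marl_size(n_trains: int) -> int:
    """Size quoted for two agents: 2 * 2**(N/2) = 2**(1 + N/2).

    Assumption of this sketch: each agent handles exactly N/2 trains.
    """
    if n_trains % 2:
        raise ValueError("this sketch assumes an even number of trains")
    return 2 * 2 ** (n_trains // 2)


def speedup(n_trains: int) -> int:
    """Ratio of the two sizes, which equals 2**(N/2 - 1)."""
    return sarl_size(n_trains) // marl_size(n_trains)


if __name__ == "__main__":
    for n in (4, 10, 22):
        print(n, sarl_size(n), marl_size(n), speedup(n))
```

For the 22-train scenario this gives 2^22 = 4,194,304 against 2^12 = 4,096, a ratio of 2^10 = 1,024. That ratio is of the same order as the reported reduction of computing time to less than 1%, although measured run time need not track this count exactly.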

Keywords: timetable rescheduling; multi-agent reinforcement learning; freight train; computing efficiency; algorithm complexity

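Classification: U292.4 [Transportation Engineering: Transportation Planning and Management]; U294.1 [Transportation Engineering: Road and Railway Engineering]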

 
