Authors: SUN Guofeng (孙国锋); JING Yun (景云) [1,2]; LI Hebi (李和壁); TIAN Zhiqiang (田志强); TIAN Xiaopeng (田小鹏) [4]
Affiliations: [1] School of Traffic and Transportation, Beijing Jiaotong University, Beijing 100044, China; [2] Frontiers Science Center for Smart High-speed Railway Systems, Beijing Jiaotong University, Beijing 100044, China; [3] Railway Science and Technology Research and Development Center, China Academy of Railway Sciences Corporation Limited, Beijing 100081, China; [4] School of Traffic and Transportation, Lanzhou Jiaotong University, Lanzhou 730070, China; [5] Key Laboratory of Railway Industry on Plateau Railway Transportation Intelligent Management and Control, Lanzhou Jiaotong University, Lanzhou 730070, China
Source: Journal of Transportation Systems Engineering and Information Technology (《交通运输系统工程与信息》), 2024, No. 2, pp. 249-262 (14 pages)
Funding: National Natural Science Foundation of China (52372300, 72161023); Fundamental Research Funds for the Central Universities (2023YJS146).
Abstract: To explain how the features of passenger transport products affect the prediction of passenger flow distribution across train travel sections, this paper proposes a prediction method for high-speed railway trains based on an interpretable machine learning framework. First, a prediction framework based on gradient-boosted tree models is proposed, and prediction models are built with four such models: GBDT, XGBoost, LightGBM, and CatBoost. Second, feature contribution importance is calculated and the feature variables are optimized using the SHapley Additive exPlanations (SHAP) method, revealing the non-linear relationships between single features, interaction features, and the predicted passenger flow distribution. Results for trains between Beijing South and Shanghai Hongqiao show that all four models predict the passenger flow distribution accurately: the coefficients of determination of GBDT, XGBoost, LightGBM, and CatBoost on the test set are 0.9664, 0.9601, 0.9680, and 0.9715, respectively. After feature optimization, the features ranked by contribution importance are benchmark train, ticket price, travel time, date, day of the week, train number, and departure time, and the CatBoost-7 model achieves a coefficient of determination of 0.9458 on the validation set. Date and benchmark train show a non-linear positive correlation with the predicted passenger flow distribution, while travel time shows a non-linear negative correlation. Benchmark trains with short travel times, high ticket prices, and departure times on the hour have a positive effect on the prediction. The results can serve as a reference for the design of high-speed railway passenger transport products.
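The attribution idea behind SHAP can be illustrated with a small, self-contained sketch. It is not the paper's model or the SHAP library's tree algorithm: it computes exact Shapley values by enumerating feature coalitions against a single reference input, which is only practical for a handful of features. The function `toy_model`, its coefficients, the three features, and the `x` and `baseline` values are all hypothetical and chosen purely for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of each feature of x, relative to a baseline input.

    Features outside a coalition are set to their baseline value.
    This is a simplification of SHAP, which averages over a background dataset.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                without_i = [x[j] if j in coalition else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def toy_model(features):
    """Hypothetical passenger-flow predictor (made-up coefficients).

    features = [is_benchmark_train, ticket_price, travel_hours]
    The last term is an interaction between benchmark status and travel time.
    """
    benchmark, price, hours = features
    return (300.0 + 80.0 * benchmark + 0.1 * price - 40.0 * hours
            + 20.0 * benchmark * (5.0 - hours))


# Hypothetical train to explain, and a hypothetical reference train.
x = [1.0, 600.0, 4.5]
baseline = [0.0, 550.0, 5.5]

phi = shapley_values(toy_model, x, baseline)
for name, value in zip(["benchmark train", "ticket price", "travel time"], phi):
    print(f"{name}: {value:+.2f}")
```

The contributions sum to the difference between the prediction for `x` and the prediction for `baseline`, which is the additivity property that SHAP relies on. The interaction term is split between the benchmark and travel-time features rather than assigned to either one alone.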
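Keywords: railway transportation; passenger flow distribution prediction; interpretable machine learning; train travel section; non-linear relationship
Classification code: U293.1 [Transportation Engineering: Transportation Planning and Management]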