Optimization model of urban public transport ticket fare based on deep reinforcement learning (基于深度强化学习的城市公共交通票价优化模型)

Cited by: 2

Authors: LI Xueyan (李雪岩); ZHANG Hankun (张汉坤); LI Jing (李静) [3]; QIU Heting (邱荷婷)

Affiliations: [1] School of Management, Beijing Union University, Beijing 100101, China; [2] School of E-commerce and Logistics, Beijing Technology and Business University, Beijing 100048, China; [3] School of Economics and Management, Beijing Jiaotong University, Beijing 100044, China; [4] School of Management and Engineering, Capital University of Economics and Business, Beijing 100070, China

Source: Journal of Industrial Engineering and Engineering Management (《管理工程学报》), 2022, No. 6, pp. 144-155 (12 pages)

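Funding: National Natural Science Foundation of China, Youth Program (72103019); Ministry of Education Humanities and Social Sciences Research Youth Fund Project (20YJC630069); Beijing Union University "Smart Beijing" Key Technology Research discipline cluster project (ZB10202002).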

Abstract: To address the complexity of travel demand evolution, this paper first treats fare optimization as a learning process in which an agent obtains the optimal price through continuous exploration in a complex environment. Second, it introduces a deep reinforcement learning algorithm, using a value-function neural network to fit the response function of travel demand (the environment) to fare setting (the action); in the game among different transport modes, fare adjustment actions are rewarded and penalized so that the agent is trained toward its decision objective. Third, to characterize the complexity of group travel decisions, three travel demand evolution scenarios, from simple to complex, are designed based on the Logit model, cumulative prospect theory, and the Bush-Mosteller model. Finally, taking the fare game between subway and bus in a realistic scenario as an example, numerical simulation is used to examine the effectiveness of the method. The findings are: (1) the deep reinforcement learning algorithm characterizes fare elasticity well while perceiving the complexity of travel demand evolution; (2) the algorithm gives reasonable and stable price schemes for complex travel demand, and after the subway (bus) fare is optimized, the profit of the subway (bus) and the total profit of all travel modes increase significantly under each travel demand model.

Fare setting is an important method of regulating passenger flow and relieving congestion in urban transportation systems. In existing fare optimization research, travel demand is mostly described as completely rational under the theory of general equilibrium. Under complex group decision-making, the relationship between travel demand and ticket fares becomes a multi-dimensional, complex, nonlinear feedback. Simplifying a traveler group's complex decision-making process can therefore reduce the effectiveness of fare setting. Focusing on the complexity of travel demand evolution, the optimization of the fare is regarded as an agent's learning process of obtaining the optimal price through continuous exploration in a complex environment. The value function neural network of deep reinforcement learning is introduced to fit the response function between travel demand (environment) and fare setting (action). In the game among different travel modes, the decision objective is achieved by training the fare adjustment actions with rewards and punishments. In terms of group decision complexity, three evolution scenarios of travel demand, from simple to complex, are established based on the traditional logit model, cumulative prospect theory, and the Bush-Mosteller model. Numerical simulation is conducted on the scenario of a fare game between a subway and a bus to verify the validity of the new methodology. In the first part of this study, in terms of fare decision-making, the traditional bi-level programming method is improved, the travel modes are divided into the objective travel mode and other travel modes, and the representative Deep Q-Learning method (DQN) is introduced to optimize the ticket fare of the objective travel mode. In the new model, the fare adjustment strategy of the objective travel mode between an origin-destination pair is set as the action variable of Deep Q-Learning, the evolutionary passenger flow of different travel modes after fare adj
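The learning loop outlined in the abstract can be illustrated with a small sketch. This is not the authors' model: it swaps the value-function neural network (DQN) for a plain tabular Q-learning table, uses only the simplest of the three demand scenarios (a logit split between subway and bus), and every number, name, and the choice of state (current fare level) and reward (change in subway profit) is a hypothetical assumption made for illustration.

```python
import math
import random


def logit_shares(costs, theta=1.0):
    """Multinomial logit shares: a lower generalized cost yields a higher share."""
    weights = [math.exp(-theta * c) for c in costs]
    total = sum(weights)
    return [w / total for w in weights]


def subway_profit(subway_fare, bus_fare=2.0, demand=1000.0, unit_cost=1.0):
    """Profit of the objective mode (subway) under logit demand (illustrative numbers)."""
    # Generalized cost = fare + a fixed travel-time penalty (hypothetical values).
    costs = [subway_fare + 1.0, bus_fare + 2.0]
    share = logit_shares(costs)[0]
    return demand * share * (subway_fare - unit_cost)


FARES = [1.0 + 0.5 * i for i in range(13)]  # candidate subway fares, 1.0 to 7.0
ACTIONS = [-1, 0, 1]                        # lower, hold, or raise the fare by one step


def train(episodes=2000, steps=20, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning over fare levels (a stand-in for the paper's DQN)."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in FARES]
    for _ in range(episodes):
        s = rng.randrange(len(FARES))  # start each episode at a random fare level
        for _ in range(steps):
            # Epsilon-greedy choice between exploring and exploiting.
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[s][i])
            # Apply the fare adjustment, clamped to the candidate grid.
            s2 = min(max(s + ACTIONS[a], 0), len(FARES) - 1)
            # Assumed reward: change in subway profit caused by the adjustment.
            r = subway_profit(FARES[s2]) - subway_profit(FARES[s])
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
    return q


def greedy_fare(q, start=0, steps=20):
    """Follow the learned greedy policy from a starting fare level."""
    s = start
    for _ in range(steps):
        a = max(range(len(ACTIONS)), key=lambda i: q[s][i])
        s = min(max(s + ACTIONS[a], 0), len(FARES) - 1)
    return FARES[s]
```

Calling `greedy_fare(train())` follows the learned policy from the lowest fare level; in this toy setting the fare it settles on can be compared with the brute-force optimum `max(FARES, key=subway_profit)`. A neural value function becomes useful once the state includes passenger flows under the more complex demand models, where a table no longer scales.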

Keywords: deep reinforcement learning; public transport; fare setting; group decision-making

Classification: U9 [Transportation Engineering]
