Authors: HUANG Xiaohui[1]; LING Jiahao; ZHANG Xiong; XIONG Liyan[1]; ZENG Hui[1] (School of Information Engineering, East China Jiaotong University, Nanchang 330013, China)
Source: Computer Engineering and Applications (《计算机工程与应用》), 2023, No. 7, pp. 294-301 (8 pages)
Funding: National Natural Science Foundation of China (62062033, 62067002); Natural Science Foundation of Jiangxi Province, General Program (20212BAB202008).
Abstract: In recent years, online car-hailing has become an indispensable part of daily travel. The core task of a car-hailing platform is to dispatch orders to suitable drivers so that overall passenger waiting time is as short as possible and driver revenue is as high as possible. Current research mainly builds models with greedy algorithms and reinforcement learning. However, most existing methods consider only the immediate satisfaction of passengers; they do not effectively account for the relative positions of vehicles and orders, and so do not reduce the waiting time of all passengers from a long-term perspective. To address this, the paper formulates order dispatch as a Markov process and proposes a multi-agent vehicle dispatch method based on local position perception. The method captures the spatio-temporal relationship between passengers and vehicles through a suitably designed input state and a convolutional neural network, reducing overall passenger waiting time over the long term. Experimental results show that, across maps of different sizes and scenarios with different numbers of vehicles and orders, the proposed method outperforms existing methods and generalizes better. In complex scenarios with large numbers of passengers and vehicles in particular, its results are significantly better than those of existing methods.
Keywords: multi-agent reinforcement learning; vehicle dispatch; local perception; deep reinforcement learning
Classification: TP391.9 [Automation and Computer Technology: Computer Application Technology]