Authors: Yunjie Tan (谭云洁), Jing Ni (倪静) [1]
Affiliation: [1] Business School, University of Shanghai for Science and Technology, Shanghai
Source: Modeling and Simulation (《建模与仿真》), 2025, No. 3, pp. 694-715 (22 pages)
Abstract: The heterogeneous vehicle routing problem (HCVRP) is a core challenge in logistics optimization because of differentiated constraints on vehicle capacity and speed, the dynamic distribution of customer demand, and inherent conflicts among multiple objectives. To address the shortcomings of existing methods in feature fusion efficiency, computational complexity, and multi-objective coordination, this paper proposes MFF-HVPP, a multi-feature fusion framework based on deep reinforcement learning. The dynamic routing problem is modeled as a Markov decision process (MDP) with a composite state space covering vehicle states and node states, and a dual-modal reward function is designed to accommodate both min-max and min-sum objectives. A multi-feature fusion encoder extracts local node features and spatial dependencies through position embeddings. A Transformer channel feature extension module (TransCFE) enhances features along the channel dimension and uses residual connections to address the gradient vanishing and overfitting found in traditional attention mechanisms. In the hierarchical decoding strategy, MFF-HVPP decouples route planning into a sequential decision process of vehicle selection and node selection, and combines it with probabilistic sampling to balance global optimization and local search. Experiments show that in a min-max scenario with 120 customer nodes, MFF-HVPP achieves a maximum travel time error rate (Gap) of 1.31%, with computational efficiency improved by 98% compared with traditional methods. In the min-sum task, the total travel time gap is only 1.07%, and the method supports real-time response in scenarios on the order of one hundred nodes. This research provides a scalable theoretical framework for multi-objective routing under complex constraints and lays a technical foundation for dynamic scheduling in intelligent logistics systems.
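The abstract names two objectives (min-max and min-sum) and reports results as a percentage Gap, but the record does not give their formulas. The sketch below is a minimal illustration under stated assumptions: Euclidean distances, one constant speed per vehicle, routes that start and end at a depot, and a Gap measured relative to a reference solution. The function names `route_time`, `objective`, and `gap_percent` are hypothetical and do not come from the paper.

```python
from math import hypot


def route_time(route, coords, speed, depot=0):
    """Travel time of one vehicle: depot -> customers in order -> depot.

    Assumes Euclidean distance and a constant vehicle speed.
    """
    stops = [depot, *route, depot]
    distance = sum(
        hypot(coords[a][0] - coords[b][0], coords[a][1] - coords[b][1])
        for a, b in zip(stops, stops[1:])
    )
    return distance / speed


def objective(routes, coords, speeds, mode="min-max"):
    """Score a fleet plan under one of the two objectives named in the abstract."""
    times = [route_time(r, coords, s) for r, s in zip(routes, speeds)]
    if mode == "min-max":
        return max(times)  # longest single-vehicle travel time
    if mode == "min-sum":
        return sum(times)  # total travel time over all vehicles
    raise ValueError(f"unknown mode: {mode}")


def gap_percent(value, reference):
    """Relative gap (in percent) to a reference objective value (assumed definition)."""
    return 100.0 * (value - reference) / reference


# Toy instance: depot 0 and two customers, two vehicles with different speeds.
coords = {0: (0.0, 0.0), 1: (3.0, 4.0), 2: (0.0, 5.0)}
routes = [[1], [2]]
speeds = [1.0, 2.0]

longest = objective(routes, coords, speeds, mode="min-max")
total = objective(routes, coords, speeds, mode="min-sum")
```

In a reinforcement learning setup, a common choice is to use the negative of such an objective as the episode reward; whether MFF-HVPP does exactly this is not stated in the record.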
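Keywords: vehicle routing; deep reinforcement learning; multi-feature fusion; channel feature extension; hierarchical decoder
Classification number: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]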