检索规则说明:AND代表“并且”;OR代表“或者”;NOT代表“不包含”;(注意必须大写,运算符两边需空一格)
检 索 范 例 :范例一: (K=图书馆学 OR K=情报学) AND A=范并思 范例二:J=计算机应用与软件 AND (U=C++ OR U=Basic) NOT M=Visual
作 者:方伟 接中冰 陆恒杨 张涛[3] FANG Wei;JIE Zhong-bing;LU Heng-yang;ZHANG Tao(International Joint Laboratory on Artificial Intelligence of Jiangsu Province,Jiangnan University,Wuxi 214122,China;Jiangsu Provincial Engineering Laboratory of Pattern Recognition and Computational Intelligence,Jiangnan University,Wuxi 214122,China;China Ship Scientific Research Center,Wuxi 214082,China)
机构地区:[1]江南大学江苏省人工智能国际合作联合实验室,江苏无锡214122 [2]江南大学江苏省模式识别与计算智能工程实验室,江苏无锡214122 [3]中国船舶科学研究中心,江苏无锡214082
出 处:《控制与决策》2024年第4期1160-1166,共7页Control and Decision
基 金:国家自然科学基金项目(62073155,62002137,62106088,62206113);船舶总体性能创新研究开放基金项目(22422213)。
摘 要:覆盖旅行商问题(covering salesman problem,CSP)是旅行商问题的变体,在防灾规划、急救管理中有着广泛应用.由于传统方法求解问题实例耗时严重,近年来深度神经网络被提出用于解决该类组合优化问题,在求解速度和泛化性上有明显的优势.现有基于深度神经网络求解CSP的方法求解质量较低,特别在大规模实例上与传统的启发式方法相比存在较大差距.针对上述问题,提出一种新的基于深度强化学习求解CSP的方法,由编码器对输入特征进行编码,提出新的Mask策略对解码器使用自注意力机制构造解的过程进行约束,并提出多起点策略改善训练过程、提高求解质量.实验结果表明,所提方法对比现有基于深度神经网络的求解方法进一步缩小了最优间隙,同时有着更高的样本效率,在不同规模和不同覆盖类型的CSP中展现出更强的泛化能力,与启发式算法相比在求解速度上有10~40倍的提升.The covering salesman problem(CSP)is a variant of the traveling salesman problem,which is widely used in disaster planning and emergency management.Since the traditional solvers are time-consuming for solving problem instances,deep neural networks have been proposed for solving this type of combinatorial optimization problem in recent years,which have obvious advantages in terms of solving speed and generalization.However,existing methods based on deep neural networks for solving CSP problems have low solution quality,especially in large-scale instances,compared with traditional heuristics.Therefore,we propose a new method based on deep reinforcement learning to solve the CSP problem,in which the encoder encodes the input features,propose a new Mask strategy to constrain the decoder to construct a solution using the self-attention mechanism,and propose a multi-start strategy to improve the training process and the solution quality.Experimental results show that the proposed method further reduces the optimality gap compared with existing deep neural network-based solution methods,has higher sample efficiency,shows stronger generalization ability in CSP tasks of different sizes and coverage types,and has a 10-40 times improvement in solution speed compared with heuristic algorithms.
关 键 词:覆盖旅行商 深度强化学习 组合优化 多起点 Mask策略
分 类 号:TP399[自动化与计算机技术—计算机应用技术]
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在链接到云南高校图书馆文献保障联盟下载...
云南高校图书馆联盟文献共享服务平台 版权所有©
您的IP:216.73.216.200