Enhancing the generalization capability of 2D array pointer networks through multiple teacher-forcing knowledge distillation  

Authors: Qidong Liu, Xin Shen, Chaoyue Liu, Dong Chen, Xin Zhou, Mingliang Xu

Affiliations: [1] School of Computer Science and Artificial Intelligence, Zhengzhou University, Zhengzhou, 450001, China; [2] National Supercomputing Center in Zhengzhou, Zhengzhou, 450001, China; [3] Nanyang Technological University, Nanyang Avenue, 639798, Singapore

Source: Journal of Automation and Intelligence (自动化与人工智能(英文)), 2025, No. 1, pp. 29-38 (10 pages)

Funding: Supported in part by the National Science Foundation of China under Grant No. 62276238; in part by the National Science Foundation for Distinguished Young Scholars of China under Grant No. 62325602; and in part by the Natural Science Foundation of Henan, China under Grant No. 232300421095.

Abstract: The Heterogeneous Capacitated Vehicle Routing Problem (HCVRP), which involves efficiently routing vehicles with diverse capacities to fulfill various customer demands at minimal cost, poses an NP-hard challenge in combinatorial optimization. Recently, reinforcement learning approaches such as 2D Array Pointer Networks (2D-Ptr) have demonstrated remarkable speed in decision-making by modeling multiple agents’ concurrent choices as a sequence of consecutive actions. However, these learning-based models often struggle with generalization: they cannot adapt to new scenarios with different numbers of vehicles or customers without retraining. Inspired by the potential of multi-teacher knowledge distillation to harness diverse knowledge from multiple sources and craft a comprehensive student model, we propose to enhance the generalization capability of 2D-Ptr through Multiple Teacher-forcing Knowledge Distillation (MTKD). We first train 12 distinct 2D-Ptr models under various settings to serve as teacher models. We then randomly sample a teacher model and a batch of problem instances, focusing on those where the chosen teacher performed best. This teacher model solves these instances, generating high-reward action sequences that guide knowledge transfer to the student model. We conduct rigorous evaluations across four distinct datasets, each comprising four HCVRP instances of varying scales. Our empirical findings underscore the proposed method's superiority over existing learning-based methods in terms of both computational efficiency and solution quality.
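
The distillation loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a tabular "student" that holds one list of logits per decision step, reads "instances where the chosen teacher performed best" as "instances on which the sampled teacher has the highest reward among all teachers", and uses hypothetical names (`instances_where_best`, `teacher_forcing_nll`, `distill_step`). The student is trained to raise the likelihood of the teacher's action at every step, which is the teacher-forcing part.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def instances_where_best(rewards, teacher_idx):
    """Indices of instances on which teacher `teacher_idx` has the top reward.

    rewards[t][i] is a hypothetical reward of teacher t on instance i
    (higher is better, e.g. negative route cost). Ties go to the lower index.
    """
    n_teachers, n_instances = len(rewards), len(rewards[0])
    return [
        i for i in range(n_instances)
        if max(range(n_teachers), key=lambda t: rewards[t][i]) == teacher_idx
    ]


def teacher_forcing_nll(step_logits, teacher_actions):
    """Mean negative log-likelihood of the teacher's actions under the student."""
    total = 0.0
    for logits, action in zip(step_logits, teacher_actions):
        total -= math.log(softmax(logits)[action])
    return total / len(teacher_actions)


def distill_step(step_logits, teacher_actions, lr=0.5):
    """One gradient-descent step on the mean NLL, updating logits in place.

    For a softmax policy, d(NLL)/d(logit_k) = prob_k - 1[k == teacher action].
    """
    n_steps = len(teacher_actions)
    for logits, action in zip(step_logits, teacher_actions):
        probs = softmax(logits)
        for k in range(len(logits)):
            grad = (probs[k] - (1.0 if k == action else 0.0)) / n_steps
            logits[k] -= lr * grad


# Toy usage: two teachers, two instances, a two-step action sequence.
rewards = [[-10.0, -5.0], [-8.0, -7.0]]
selected = instances_where_best(rewards, teacher_idx=0)
student_logits = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
teacher_actions = [2, 0]
loss_before = teacher_forcing_nll(student_logits, teacher_actions)
distill_step(student_logits, teacher_actions)
loss_after = teacher_forcing_nll(student_logits, teacher_actions)
```

In a real 2D-Ptr student the logits at each step would come from the network conditioned on the teacher's previous actions; the per-step table here only stands in for that so the loss and its gradient stay easy to follow.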

Keywords: Vehicle routing problem; Multi-teacher knowledge distillation; Teacher-forcing; Pointer network

Classification code: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]

 
