Discrete state transition algorithm based on deep reinforcement learning for solving the flexible job shop scheduling problem

A deep reinforcement learning based on discrete state transition algorithm for solving fuzzy flexible job shop scheduling problem

Authors: ZHU Jiazheng; WANG Cong[1]; LI Xinkai; DONG Yingchao; ZHANG Hongli[1] (School of Electrical Engineering, Xinjiang University, Urumqi 830047, China)

Affiliation: [1] School of Electrical Engineering, Xinjiang University, Urumqi 830047, China

Source: Journal of Beijing University of Aeronautics and Astronautics, 2025, No. 4, pp. 1385-1394 (10 pages)

Funding: National Natural Science Foundation of China (52267010, 62263030); Natural Science Foundation of Xinjiang Uygur Autonomous Region (2022D01C367, 2022D01E33).

Abstract: The flexible job shop scheduling problem (FJSP) arises widely in practice, so intelligent algorithms for solving it are of considerable value. To solve the FJSP with the objective of minimizing the maximum completion time (makespan), this paper proposes a discrete state transition algorithm based on proximal policy optimization (DSTA-PPO). DSTA-PPO has three characteristics. First, because the FJSP requires operation sequencing and machine assignment to be scheduled simultaneously, a state feature that fully expresses the current scheduling problem is designed by combining the operation encoding and the machine encoding. Second, several critical-path-based search operations are designed for operation sequencing and machine assignment. Third, reinforcement learning training effectively guides the agent to select the appropriate search operation for optimizing the current scheduling sequence. Simulation experiments on different datasets verify the effectiveness of each component of the algorithm. In addition, the algorithm is compared with existing algorithms on the same instances, using the minimized maximum completion time as the comparison metric. The results show that the proposed algorithm solves most instances with a shorter completion time, and thus solves the FJSP effectively.
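To make the objective concrete, the sketch below decodes a candidate solution into its maximum completion time. The abstract does not give the paper's actual encoding or decoding rules, so this assumes the common two-part representation: an operation sequence in which the k-th occurrence of a job id denotes that job's k-th operation, plus a machine assignment per operation. The function name `decode_makespan` and the toy instance are hypothetical and for illustration only.

```python
from typing import Dict, List

# Hypothetical toy instance (not from the paper):
# proc_times[job][op] maps each eligible machine id to a processing time.
proc_times: List[List[Dict[int, int]]] = [
    [{0: 3, 1: 5}, {1: 2}],        # job 0: two operations
    [{0: 4, 1: 2}, {0: 3, 1: 4}],  # job 1: two operations
]


def decode_makespan(
    op_seq: List[int],
    machine_assign: List[List[int]],
    proc_times: List[List[Dict[int, int]]],
) -> int:
    """Return the maximum completion time of an encoded schedule.

    Assumed encoding: the k-th occurrence of job j in op_seq is operation k
    of job j, and machine_assign[j][k] is the machine chosen for it.
    Operations are appended to their machine in sequence order
    (no insertion into idle gaps).
    """
    n_jobs = len(proc_times)
    next_op = [0] * n_jobs              # next unscheduled operation per job
    job_ready = [0] * n_jobs            # finish time of each job's last scheduled operation
    machine_ready: Dict[int, int] = {}  # time at which each machine becomes free

    for job in op_seq:
        k = next_op[job]
        machine = machine_assign[job][k]
        duration = proc_times[job][k][machine]  # KeyError if the machine is not eligible
        start = max(job_ready[job], machine_ready.get(machine, 0))
        end = start + duration
        job_ready[job] = end
        machine_ready[machine] = end
        next_op[job] = k + 1

    return max(job_ready)


op_seq = [0, 1, 0, 1]
print(decode_makespan(op_seq, [[0, 1], [1, 0]], proc_times))  # 6
print(decode_makespan(op_seq, [[1, 1], [0, 0]], proc_times))  # 7
```

The two calls use the same operation sequence and differ only in machine assignment, which illustrates why a search operation may need to act on either part of the encoding. Under the abstract's description, a learned policy would choose which search operation to apply to such a solution, and the change in makespan is one natural signal for judging the result; the reward actually used in the paper is not stated in the abstract.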

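Keywords: deep learning; reinforcement learning; discrete state transition algorithm; proximal policy optimization algorithm; flexible job shop scheduling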

Classification number: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]

 
