Interactive Operation Agent Scheduling Method for Job Shop Based on Deep Reinforcement Learning  Cited by: 2

Authors: CHEN Ruiqi, LI Wenxin, WANG Chuanyang[1], YANG Hongbing[1] (School of Mechanical and Electric Engineering, Soochow University, Suzhou 215000; School of Management, Shanghai University, Shanghai 200230)

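Affiliations: [1] School of Mechanical and Electric Engineering, Soochow University, Suzhou 215100 [2] School of Management, Shanghai University, Shanghai 200230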

Source: Journal of Mechanical Engineering, 2023, No. 12, pp. 78-88 (11 pages)

Funding: National Natural Science Foundation of China (52075354).

Abstract: The job shop scheduling problem (JSSP) is NP-hard, which makes high-quality solutions hard to obtain quickly, and random disturbances in production scenarios force frequent rescheduling. To address both difficulties, a novel interactive operation agent (IOA) scheduling model framework is proposed based on deep reinforcement learning. After analyzing the process-route and processing-equipment constraints among operations, each operation in the job shop is modeled as an operation agent, and an interaction mechanism among the agents is designed: agents exchange features according to their mutual relationships and update their own feature vectors. A deep neural network that fits the action-value function is then designed from the operation features and the earliest processing time, so the scheduling model can generate a scheduling policy from the system state and the operation-agent features. The IOA scheduling model is trained with the Double DQN algorithm, and an experience replay mechanism is introduced to remove the correlation between sequential training samples. The trained model can quickly generate high-quality schedules and can effectively execute a rescheduling policy when a machine fails. Experimental results show that the proposed IOA scheduling method outperforms the greedy algorithm and heuristic scheduling rules, and has good robustness and generalization ability.
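
The training scheme named in the abstract (Double DQN with experience replay) can be illustrated with a minimal sketch. This is not the authors' implementation: the `ReplayBuffer` class, the `double_dqn_target` function, the `feasible` list of schedulable operations, the reward value, and the toy Q-values are all hypothetical and chosen only for illustration. The sketch shows the two ideas in isolation: uniform sampling from a buffer to decorrelate consecutive transitions, and a target in which the online network selects the next action while the target network evaluates it.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size transition store; uniform sampling breaks temporal correlation."""

    def __init__(self, capacity, seed=0):
        self.buffer = deque(maxlen=capacity)  # oldest transitions are dropped first
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Sample without replacement from the stored transitions.
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def double_dqn_target(reward, done, q_online_next, q_target_next, feasible, gamma=0.99):
    """Double DQN target for one transition.

    q_online_next / q_target_next: Q-values of each candidate operation at the
    next state, from the online and target networks (plain lists here).
    feasible: indices of operations that may be scheduled next (an assumption
    of this sketch, not a detail taken from the paper).
    """
    if done or not feasible:
        return reward
    # Action selection uses the online network ...
    best = max(feasible, key=lambda a: q_online_next[a])
    # ... while action evaluation uses the target network.
    return reward + gamma * q_target_next[best]


# Toy values, purely illustrative.
q_online_next = [0.2, 0.9, 0.5]
q_target_next = [0.3, 0.4, 0.8]

target_all = double_dqn_target(-1.0, False, q_online_next, q_target_next, [0, 1, 2], gamma=0.9)
target_masked = double_dqn_target(-1.0, False, q_online_next, q_target_next, [0, 2], gamma=0.9)
target_terminal = double_dqn_target(-1.0, True, q_online_next, q_target_next, [0, 1, 2], gamma=0.9)

buffer = ReplayBuffer(capacity=3)
for step in range(5):
    buffer.push(step, 0, -1.0, step + 1, step == 4)
batch = buffer.sample(2)
```

With all three operations feasible, the online network picks operation 1, so the target uses the target network's value 0.4 rather than its own maximum 0.8. This decoupling is what distinguishes Double DQN from the standard DQN target.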

Keywords: job shop scheduling; deep reinforcement learning; operation agent; machine failure; Double DQN algorithm

Classification No.: TH166 [Mechanical Engineering: Mechanical Manufacturing and Automation]

 
