Intercell Dynamic Scheduling Method Based on Deep Reinforcement Learning


Authors: Ni Jing[1]; Ma Mengke (University of Shanghai for Science and Technology, Shanghai 200093, China)

Affiliation: [1] University of Shanghai for Science and Technology, Shanghai 200093

Source: Journal of System Simulation (《系统仿真学报》), 2023, No. 11, pp. 2345-2358 (14 pages)

Funding: Humanities and Social Sciences Fund of the Ministry of Education (19YJAZH064).

Abstract: To address intercell scheduling with dynamically arriving machining tasks, and to enable adaptive scheduling in the complex, changing environment of an intelligent workshop, a scheduling method based on a deep Q network is proposed. A complex network is constructed with cells as nodes and the intercell machining paths of workpieces as directed edges, and degree values are introduced to define a state space that captures intercell scheduling characteristics. A compound scheduling rule composed of a workpiece layer, a cell layer, and a machine layer is designed; optimizing layer by layer makes the scheduling scheme more global. Because the double deep Q network (DDQN) still selects suboptimal actions in the later stage of training, a search strategy built mainly on an exponential function is proposed. Simulation experiments at different scales verify that the proposed method copes with changing dynamic environments and quickly generates relatively good scheduling schemes.
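The three ingredients named in the abstract (degree values on the cell network, an exponential search strategy, and DDQN) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not give the state definition, the form of the exponential function, or any hyperparameters. The function names (`cell_degrees`, `epsilon`, `ddqn_target`), the choice to count every intercell transfer as one edge, and the decay constants are hypothetical; only the DDQN target follows the standard textbook form.

```python
import math
from collections import defaultdict


def cell_degrees(routes):
    """Return (in_degree, out_degree) per cell for a directed cell network.

    routes: dict mapping a job id to the ordered list of cells it visits.
    Assumption: each consecutive pair of *different* cells adds one
    directed edge, so degrees count intercell transfers.
    """
    in_deg = defaultdict(int)
    out_deg = defaultdict(int)
    for cells in routes.values():
        for src, dst in zip(cells, cells[1:]):
            if src != dst:  # staying inside one cell is not an intercell move
                out_deg[src] += 1
                in_deg[dst] += 1
    return dict(in_deg), dict(out_deg)


def epsilon(step, eps_start=1.0, eps_end=0.01, decay=0.005):
    """Exploration rate that decays exponentially with the training step.

    Hypothetical form and constants; the paper's actual function may differ.
    """
    return eps_end + (eps_start - eps_end) * math.exp(-decay * step)


def ddqn_target(reward, gamma, q_online_next, q_target_next, done):
    """Standard Double DQN target for one transition.

    The online network's Q-values choose the next action; the target
    network's Q-values evaluate it.
    """
    if done:
        return reward
    best = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    return reward + gamma * q_target_next[best]


if __name__ == "__main__":
    # Hypothetical job routes over three cells.
    routes = {"J1": ["C1", "C2", "C3"], "J2": ["C2", "C3"]}
    print(cell_degrees(routes))
    print(epsilon(0), epsilon(1000))
```

An exponentially decaying exploration rate drives the probability of random actions toward a small floor late in training, which is one plausible way to reduce the late-stage suboptimal choices the abstract mentions.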

Keywords: intercell scheduling; dynamic scheduling; reinforcement learning; degree value; compound rule

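Classification: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]; TP391 [Automation and Computer Technology: Control Science and Engineering]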

 
