Application of Improved Q Learning Algorithm in Job Shop Scheduling Problem (cited by: 4)

Authors: Zhao Yejian; Wang Yanhong [1]; Zhang Jun; Yu Hongxia [1]; Tian Zhongda [1] (School of Artificial Intelligence, Shenyang University of Technology, Shenyang 110027, China)

Affiliation: [1] School of Artificial Intelligence, Shenyang University of Technology, Shenyang, Liaoning 110027, China

Source: Journal of System Simulation (《系统仿真学报》), 2022, No. 6, pp. 1247-1258 (12 pages)

Funding: National Natural Science Foundation of China (61803273); Liaoning Province Key Research and Development Program (2020JH2/10100041).

Abstract: To solve the job shop scheduling problem in a dynamic environment, a dynamic scheduling algorithm based on an improved Q-learning algorithm and dispatching rules is proposed. The state space of the dynamic scheduling algorithm is described using the concept of "the urgency of remaining tasks", and a reward function is designed on the principle of "the higher the slack, the higher the penalty". The traditional Q-learning algorithm is improved by introducing an action selection strategy built on the softmax function, which makes the probabilities of selecting different actions more equal in the early stage of learning and mitigates the tendency of the greedy strategy to keep selecting sub-optimal actions in the later stage. Simulation results obtained from 6 different test instances show that the performance indicator of the scheduling algorithm improves by about 6.5% on average compared with the algorithm before improvement, and by about 38.3% and 38.9% on average compared with the IPSO and PSO algorithms, respectively. The scheduling results are significantly better than those of conventional methods such as a single dispatching rule and traditional optimization algorithms.
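The abstract does not give the exact formulas, so the following is only a minimal sketch of softmax (Boltzmann) action selection combined with the standard tabular Q-learning update. The `temperature` parameter, the learning rate `alpha`, the discount factor `gamma`, and all function names are assumptions made for illustration; they are not taken from the paper. Each action index could stand for one dispatching rule, which is also an assumption.

```python
import math
import random


def softmax_probabilities(q_values, temperature=1.0):
    """Return softmax selection probabilities for a list of Q-values.

    `temperature` is a hypothetical tuning parameter: larger values flatten
    the distribution, smaller values concentrate it on the best action.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtracting the maximum keeps exp() from overflowing and leaves
    # the resulting probabilities unchanged.
    max_q = max(q_values)
    weights = [math.exp((q - max_q) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def select_action(q_values, temperature=1.0, rng=random):
    """Sample an action index according to the softmax probabilities."""
    probs = softmax_probabilities(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply the standard one-step Q-learning update in place."""
    best_next = max(q_table[next_state])
    td_target = reward + gamma * best_next
    q_table[state][action] += alpha * (td_target - q_table[state][action])


# Usage: two states, two candidate actions (e.g. two dispatching rules).
q_table = {0: [0.0, 0.0], 1: [1.0, 2.0]}
action = select_action(q_table[0], temperature=1.0)
q_update(q_table, state=0, action=action, reward=-1.0, next_state=1)
```

When all Q-values of a state are equal, as is typical early in learning, the softmax probabilities are uniform, which matches the abstract's point about more equal selection probabilities in the early stage. As the Q-values separate, the probability of the lower-valued actions shrinks, and lowering the assumed temperature makes that effect stronger.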

Keywords: reinforcement learning; Q-learning; dispatching rules; dynamic scheduling; job shop scheduling

Classification: TB497 [General Industrial Technology]; TP278 [Automation and Computer Technology: Detection Technology and Automation Devices]
