Research on Flexible Job Shop Scheduling Integrating Reinforcement Learning and Variable Neighborhood Search  Cited by: 2

Reinforcement Learning-based Variable Neighborhood Search Algorithm for Flexible Job Shop Problem


Authors: 米恬怡 (MI Tianyi); 唐秋华 (TANG Qiuhua); 成丽新 (CHENG Lixin); 张利平 (ZHANG Liping)

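Affiliations: [1] Key Laboratory of Metallurgical Equipment and Control Technology (Ministry of Education), Wuhan University of Science and Technology, Wuhan, Hubei 430081, China [2] Hubei Key Laboratory of Mechanical Transmission and Manufacturing Engineering, Wuhan University of Science and Technology, Wuhan, Hubei 430081, China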

Source: Industrial Engineering and Management (《工业工程与管理》), 2023, No. 5, pp. 101-107 (7 pages)

Funding: National Natural Science Foundation of China (51875421, 51875420).

Abstract: For the flexible job shop scheduling problem with the objective of minimizing makespan, an improved reinforcement learning-based variable neighborhood search algorithm (IRLVNS) is proposed to improve solution performance. Pearson correlation analysis identifies operation processing time as a key feature, and a neighborhood structure that prioritizes processing time is designed to refine the search space. Using reinforcement learning, the algorithm's evolution state set, an action set over its key parameters, and a reward mechanism are designed. An improved ε-greedy strategy selects the action: as the value of ε changes adaptively, the algorithm tends to explore new solutions in the early stage and to exploit neighboring solutions in the late stage. This establishes a relationship between the algorithm's state and its parameters and realizes adaptive parameter selection. Computational experiments indicate that the proposed algorithm, by adjusting its parameters dynamically through reinforcement learning, has advantages in both solution quality and stability.
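The abstract does not give the ε schedule, the state set, the action set, or the reward, so the following is only a minimal sketch of the general idea: an ε-greedy selector whose ε shrinks over the run, paired with a standard one-step Q-learning update. The linear decay, the function names (`epsilon_at`, `select_action`, `q_update`), the default values, and the example states and actions are all assumptions made for illustration, not the authors' design.

```python
import random


def epsilon_at(iteration, max_iterations, eps_start=0.9, eps_end=0.05):
    """Assumed linear schedule: epsilon falls from eps_start to eps_end."""
    frac = min(max(iteration / max_iterations, 0.0), 1.0)
    return eps_start + (eps_end - eps_start) * frac


def select_action(q_table, state, actions, epsilon, rng=random):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    # Unseen (state, action) pairs default to a Q-value of 0.0.
    return max(actions, key=lambda a: q_table.get((state, a), 0.0))


def q_update(q_table, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update on a dict-based Q-table."""
    best_next = max(q_table.get((next_state, a), 0.0) for a in actions)
    old = q_table.get((state, action), 0.0)
    q_table[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q_table[(state, action)]


if __name__ == "__main__":
    # Hypothetical setup: states describe search progress, and actions are
    # candidate values of one VNS parameter (e.g. a local-search depth).
    actions = [5, 10, 20]
    q = {}
    state = "stagnating"
    for it in range(100):
        eps = epsilon_at(it, 100)
        action = select_action(q, state, actions, eps)
        reward = 1.0 if action == 10 else 0.0  # placeholder reward signal
        q_update(q, state, action, reward, state, actions)
    print(select_action(q, state, actions, epsilon=0.0))
```

With a large ε early on, most choices are random, which corresponds to the exploration phase described in the abstract. As ε approaches its floor, the selector increasingly returns the action with the highest learned value for the current state.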

Keywords: reinforcement learning; variable neighborhood search; adaptive parameter selection; flexible job shop

Classification code: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]

 
