Review of Reinforcement Learning for Combinatorial Optimization Problem  (Cited by: 16)

Authors: WANG Yang; CHEN Zhibin[1]; WU Zhaorui; GAO Yuan (Faculty of Science, Kunming University of Science and Technology, Kunming 650000, China)

Affiliation: [1] Faculty of Science, Kunming University of Science and Technology, Kunming 650000, China

Source: Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), 2022, No. 2, pp. 261-279 (19 pages)

Funding: National Natural Science Foundation of China (11761042).

Abstract: Solution methods for combinatorial optimization problems (COPs) have permeated many fields, including artificial intelligence and operations research. As data scale grows and problems change more quickly, traditional methods for solving COPs are challenged in computational speed, precision, and generalization ability. In recent years, reinforcement learning (RL) has been widely applied in autonomous driving, industrial automation, and other fields, showing strong decision-making and learning ability. Many researchers have therefore tried to use RL to solve COPs, which provides a novel approach to these problems. This paper first briefly reviews common COPs and the basic principles of RL. It then describes the difficulties of solving COPs with RL, analyzes the advantages of applying RL in the combinatorial optimization (CO) field, and examines the principles by which RL is combined with COPs. Next, it summarizes recent theoretical methods and applied research on solving COPs with RL, and compares the key points, algorithmic logic, and optimization results of representative studies in order to highlight the advantages of RL models; it also summarizes the limitations of the different methods and the scenarios in which they apply. Finally, it proposes four potential research directions for solving COPs with RL.

Keywords: reinforcement learning (RL); deep reinforcement learning (DRL); combinatorial optimization problem (COP)

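Classification: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]; O22 [Automation and Computer Technology: Control Science and Engineering]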

 
