Survey of Deep Reinforcement Learning Methods with Evolutionary Algorithms (结合进化算法的深度强化学习方法研究综述)

Cited by: 13

Authors: LÜ Shuai (吕帅) [1,2]; GONG Xiao-Yu (龚晓宇); ZHANG Zheng-Hao (张正昊); HAN Shuai (韩帅); ZHANG Jun-Wei (张峻伟)

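Affiliations: [1] College of Computer Science and Technology, Jilin University, Changchun 130012 [2] Key Laboratory of Symbolic Computation and Knowledge Engineering (Jilin University), Ministry of Education, Changchun 130012 [3] Department of Information and Computing Sciences, Utrecht University, Utrecht 3584 CC, The Netherlands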

Source: Chinese Journal of Computers (《计算机学报》), 2022, No. 7, pp. 1478-1499 (22 pages)

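Funding: Supported by the National Key Research and Development Program of China (2017YFB1003103), the National Natural Science Foundation of China (61763003), and the Natural Science Foundation of Jilin Province (20180101053JC).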

Abstract: Deep reinforcement learning is currently one of the important research branches of machine learning. It achieves end-to-end learning by interacting directly with the environment and handles high-dimensional, large-scale problems well. Although deep reinforcement learning has achieved remarkable results, it still suffers from insufficient exploration of the environment, poor robustness, and susceptibility to deceptive gradients caused by deceptive rewards. Evolutionary algorithms generally offer good global search ability, robustness, and parallelism, so combining them with deep reinforcement learning to compensate for these shortcomings has become a current research hotspot. This paper focuses on the application of evolutionary algorithms in model-free deep reinforcement learning. It first briefly introduces evolutionary algorithms and the basic methods of reinforcement learning, then describes in detail two classes of reinforcement learning methods that incorporate evolutionary algorithms, namely reinforcement learning with evolutionary-algorithm-guided policy search and deep reinforcement learning combined with evolutionary algorithms, and compares and analyzes these methods. Finally, it explores the research priorities and development trends of the field.

Published English abstract: Deep reinforcement learning is one of the most important branches in the field of machine learning. It can achieve end-to-end learning through direct interaction with the environment and is capable of solving high-dimensional and large-scale problems. Although deep reinforcement learning has achieved remarkable results, it still faces problems such as insufficient exploration of the environment, poor robustness, and susceptibility to deceptive gradients caused by deceptive rewards. In general, evolutionary algorithms have good global search ability, robustness, parallelism, and other advantages. Therefore, methods that combine evolutionary algorithms with deep reinforcement learning to compensate for the shortcomings of deep reinforcement learning have recently become a research hotspot. This paper focuses on the applications of evolutionary algorithms in model-free deep reinforcement learning methods. We first introduce evolutionary algorithms and the basic methods of reinforcement learning. After that, we introduce the characteristics, advantages, disadvantages, and applicable tasks of evolutionary algorithms, deep reinforcement learning algorithms, and combined methods of evolutionary algorithms and deep reinforcement learning, showing the necessity of combined methods from a different aspect. Then, two types of reinforcement learning methods with evolutionary algorithms are elaborated: reinforcement learning with evolutionary-algorithm-guided policy search, and the combination of evolutionary algorithms and deep reinforcement learning. For reinforcement learning with evolutionary-algorithm-guided policy search, we categorize the different policy search methods into parameter distribution search methods, policy gradient approximation methods, and policy population search methods. Parameter distribution search methods regard the parameters of a policy as a distribution and sample the parameters from this distribution to form a new policy. Policy gradient approximation methods use the fitness of the policy as an approximation of the g
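The parameter distribution search idea mentioned in the abstract (treat policy parameters as a distribution, sample from it, and form new policies from the samples) can be illustrated with a small sketch. This is not the survey's own algorithm: it assumes a diagonal Gaussian over parameters and a cross-entropy-style refit to the top-scoring samples, and the names `toy_fitness` and `distribution_search` are hypothetical. The toy fitness stands in for an episode return, which a real reinforcement learning setup would obtain by running the policy in an environment.

```python
import numpy as np

def toy_fitness(theta):
    """Hypothetical stand-in for an episode return; highest at theta == target."""
    target = np.array([1.0, -2.0, 0.5])
    return -np.sum((theta - target) ** 2)

def distribution_search(fitness, dim, iterations=60, pop_size=64,
                        elite_frac=0.25, seed=0):
    """Search over a diagonal Gaussian on policy parameters (assumed form)."""
    rng = np.random.default_rng(seed)
    mean = np.zeros(dim)        # current centre of the parameter distribution
    std = np.full(dim, 2.0)     # current per-dimension spread
    n_elite = max(1, int(pop_size * elite_frac))
    for _ in range(iterations):
        # Sample candidate parameter vectors from the current distribution.
        samples = mean + std * rng.standard_normal((pop_size, dim))
        # Score every candidate; here fitness replaces an environment rollout.
        scores = np.array([fitness(s) for s in samples])
        # Keep the best-scoring candidates and refit the distribution to them.
        elite = samples[np.argsort(scores)[-n_elite:]]
        mean = elite.mean(axis=0)
        std = elite.std(axis=0) + 0.01  # small floor keeps some exploration
    return mean

best = distribution_search(toy_fitness, dim=3)
```

No gradient of the fitness is used anywhere in the loop, which is why this family of methods can be applied when rewards are sparse or deceptive; the price is that every iteration needs `pop_size` fitness evaluations.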

Keywords: reinforcement learning; deep reinforcement learning; evolutionary algorithms; genetic algorithms; evolution strategies

Classification number: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]

 
