Review of Deep Reinforcement Learning in Latent Space (潜在空间中深度强化学习方法研究综述)


Authors: ZHAO Tingting (赵婷婷)[1], SUN Wei (孙威), CHEN Yarui (陈亚瑞)[1], WANG Yuan (王嫄), YANG Jucheng (杨巨成)[1]

Affiliation: [1] College of Artificial Intelligence, Tianjin University of Science and Technology, Tianjin 300457, China

Source: Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), 2023, No. 9, pp. 2047-2074 (28 pages)

Funding: National Natural Science Foundation of China (61976156); Tianjin Enterprise Science and Technology Commissioner Project (20YDTPJC00560).

Abstract: Deep reinforcement learning (DRL) is an effective learning paradigm for realizing general artificial intelligence and has achieved remarkable results in a range of real-world applications. However, DRL suffers from poor generalization and low sample efficiency. Representation learning based on deep neural networks can effectively alleviate these problems by learning the underlying structure of the environment, and latent-space-based DRL has therefore become a mainstream approach in the field. This paper systematically reviews research progress on latent-space representation learning in DRL. Existing latent-space DRL methods are analyzed and summarized, and are grouped into state representation, action representation, and dynamics models in the latent space. State representation in the latent space is further divided into reconstruction-based methods, methods based on bisimulation equivalence, and other state representation methods. Finally, successful applications of latent-space DRL in games, intelligent control, recommendation, and other domains are presented, followed by a brief discussion of future development trends in the field.

Keywords: reinforcement learning; deep learning; latent space; state representation; action representation

Classification number: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]

 
