Flight Vehicle Control Law Self-Learning Approach Based on Lightweight Search of Neural Network Architecture (神经网络架构轻量化搜索的飞行器控制律自学习方法)

Authors: WANG Zhaolei (王昭磊), WANG Ludi (王露荻), LU Kunfeng (路坤锋), YU Chunmei (禹春梅) [1,2], LI Xiaomin (李晓敏), LIN Ping (林平) [1,2]

Affiliations: [1] National Key Laboratory of Science and Technology on Aerospace Intelligence Control, Beijing 100854, China; [2] Beijing Aerospace Automatic Control Institute, Beijing 100854, China

Source: Journal of Astronautics (《宇航学报》), 2024, No. 5, pp. 762-769 (8 pages)

Funding: National Natural Science Foundation of China (U21B2028).

Abstract: When the soft actor-critic (SAC) reinforcement learning algorithm is used for self-learning of complex flight vehicle control laws, hyperparameter settings depend heavily on manual expertise, which makes the design difficult. To address this issue, a flight vehicle control law self-learning method based on a lightweight neural network architecture search strategy is proposed. The method first transforms the neural network architecture design problem into a graph topology generation problem. It then combines a graph topology generation algorithm based on an LSTM recurrent neural network, a lightweight weight-sharing mechanism for training and evaluating deep reinforcement learning parameters, and a policy-gradient-based algorithm for learning the parameters of the graph topology generator. The result is a lightweight automatic search framework for deep reinforcement learning that automatically optimizes the neural network architecture hyperparameters in the SAC training algorithm and thereby completes self-learning of the control law. Three-dimensional return-and-landing control is used as an example to verify the effectiveness and practicality of the proposed method.
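The policy-gradient search loop named in the abstract can be illustrated with a heavily simplified sketch. It is not the authors' implementation: the LSTM topology generator is replaced by independent softmax logits per layer, the search space (`WIDTHS`, `NUM_LAYERS`) is hypothetical, weight sharing is not modelled, and `proxy_reward` is a toy stand-in for the return a SAC agent would obtain with the sampled architecture. Only the REINFORCE-style update of the generator parameters is shown.

```python
import math
import random

# Hypothetical search space: one width choice for each of three hidden layers.
WIDTHS = [16, 32, 64, 128]
NUM_LAYERS = 3


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_architecture(logits, rng):
    """Sample one width index per layer from the generator's categorical policy."""
    arch = []
    for layer_logits in logits:
        probs = softmax(layer_logits)
        arch.append(rng.choices(range(len(WIDTHS)), weights=probs)[0])
    return arch


def proxy_reward(arch):
    """Toy stand-in for the evaluated return of a sampled architecture.

    Assumption for illustration only: the reward peaks when every layer has
    width 64. A real search would evaluate the trained agent instead.
    """
    return -sum(abs(math.log2(WIDTHS[i]) - 6) for i in arch)


def reinforce_search(steps=2000, lr=0.1, seed=0):
    """Update the generator logits with REINFORCE and a moving-average baseline."""
    rng = random.Random(seed)
    logits = [[0.0] * len(WIDTHS) for _ in range(NUM_LAYERS)]
    baseline = 0.0
    for _ in range(steps):
        arch = sample_architecture(logits, rng)
        reward = proxy_reward(arch)
        advantage = reward - baseline
        baseline = 0.9 * baseline + 0.1 * reward
        for layer, choice in enumerate(arch):
            probs = softmax(logits[layer])
            for k in range(len(WIDTHS)):
                # Gradient of log softmax probability of the sampled choice.
                grad = (1.0 if k == choice else 0.0) - probs[k]
                logits[layer][k] += lr * advantage * grad
    return logits


def best_architecture(logits):
    """Return the highest-logit width for each layer."""
    return [WIDTHS[max(range(len(WIDTHS)), key=lambda k: row[k])] for row in logits]
```

Calling `best_architecture(reinforce_search())` returns one width per layer. Under the toy reward the logits tend to drift toward width 64, although a stochastic search of this kind carries no guarantee of reaching the optimum in a given run.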

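Keywords: flight vehicle; control law self-learning; automated machine learning; network architecture search; SAC reinforcement learning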

Classification: TP273 [Automation and Computer Technology: Detection Technology and Automatic Devices]
