Authors: ZHOU Guofeng; YAN Dawei [2]; LIANG Zhuo
Affiliations: [1] College of Aerospace Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China; [2] China Academy of Launch Vehicle Technology, Beijing 100076, China
Source: Journal of Chinese Inertial Technology (《中国惯性技术学报》), 2022, No. 1, pp. 135-140 (6 pages)
Funding: Equipment Development Field Fund (41412050).
Abstract: During the climb of a ramjet-powered vehicle, engine performance varies with the flight state and is susceptible to the coupled effects of power-performance deviations, aerodynamic deviations, and wind disturbances, so traditional methods struggle to produce an energy-optimal climb trajectory. To address this problem, a trajectory optimization control method based on reinforcement learning is proposed. First, a reinforcement learning task model based on proximal policy optimization (PPO) is constructed, which recasts trajectory optimization as the problem of producing an optimal action policy from the current state. To counter reward sparsity, samples that fail to reach the target region are assigned a generalized distance reward. Sensitivity to initial values is reduced by sampling initial values during controller training. A linear extended state observer (LESO) is combined with reinforcement learning so that disturbances are observed and compensated, which improves the controller's disturbance rejection. Simulation results show that the proposed algorithm reduces the terminal constraint error by 60%, which can provide a reference for ramjet trajectory optimization control in complex environments.
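Keywords: ramjet; trajectory optimization; reinforcement learning; linear extended state observer
Classification: V279 [Aeronautical and Astronautical Science and Technology: Flight Vehicle Design]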
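The abstract names two mechanisms without giving their form: a generalized distance reward for episodes that miss the target region, and an LESO that estimates disturbances for compensation. The sketch below illustrates both under stated assumptions and is not the authors' implementation. The reward is assumed to be a normalized Euclidean distance to the target state; the observer is the textbook second-order LESO for an assumed first-order plant x' = f + b0*u, with bandwidth-parameterized gains. All names (`terminal_reward`, `LESO`, `simulate`, `scales`, `omega_o`) and all numeric values are hypothetical.

```python
import math


def terminal_reward(state, target, scales, tol=1.0, success_bonus=10.0):
    """End-of-episode reward (assumed form, not taken from the paper).

    state, target, scales are equal-length sequences; scales makes each
    component dimensionless so the components can be combined.
    """
    # Normalized Euclidean distance between final state and target.
    dist = math.sqrt(sum(((s - t) / k) ** 2
                         for s, t, k in zip(state, target, scales)))
    if dist <= tol:
        return success_bonus  # inside the target region
    return -dist  # missed: graded penalty instead of a flat zero


class LESO:
    """Second-order linear extended state observer for x' = f + b0*u.

    z1 tracks the measured state x; z2 tracks the lumped disturbance f.
    """

    def __init__(self, b0, omega_o, dt):
        self.b0 = b0
        self.beta1 = 2.0 * omega_o   # places both observer poles at -omega_o
        self.beta2 = omega_o ** 2
        self.dt = dt
        self.z1 = 0.0
        self.z2 = 0.0

    def update(self, y, u):
        """Advance the observer one forward-Euler step from measurement y."""
        e = y - self.z1
        dz1 = self.z2 + self.b0 * u + self.beta1 * e
        dz2 = self.beta2 * e
        self.z1 += self.dt * dz1
        self.z2 += self.dt * dz2
        return self.z1, self.z2


def simulate(disturbance=2.0, steps=2000, dt=0.001):
    """Run the observer against a toy plant x' = disturbance + u, with u = 0."""
    obs = LESO(b0=1.0, omega_o=20.0, dt=dt)
    x, u = 0.0, 0.0
    for _ in range(steps):
        obs.update(x, u)
        x += dt * (disturbance + u)
    return obs.z2  # disturbance estimate at the end of the run
```

In a combined scheme, one plausible arrangement (again an assumption) is to subtract the estimate from the policy output, for example `u = u_policy - obs.z2 / obs.b0`, so the PPO policy acts on an approximately disturbance-free plant.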