Research on Aircraft Standard Trajectory Tracking Guidance and Formation Keeping Based on Reinforcement Learning


Authors: TENG Qinghua; HUI Junpeng; LI Tianren; YANG Ben

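Affiliations: [1] Research & Development Center, China Academy of Launch Vehicle Technology, Beijing 100076; [2] Beijing Institute of Space Long March Vehicle, Beijing 100076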

Source: Missiles and Space Vehicles, 2025, No. 2, pp. 60-68 (9 pages)

Abstract: Intelligent upgrades to flight vehicles place new demands on guidance capability, and traditional algorithms perform poorly when tracking a three-dimensional trajectory in the presence of deviations. A trajectory tracking guidance method is designed based on the TD3 reinforcement learning algorithm. An action space expressed in deviation form, penalty terms in the reward function, and guidance using the rate of change of distance together address the problems of difficult training convergence, excessive fluctuation of the control commands, and large accumulated deviation at the midcourse-to-terminal handover point. Compared with the traditional LQR algorithm, the reinforcement learning guidance algorithm achieves considerably better guidance accuracy and adaptability to deviations, and it generalizes well enough to be applied to small-scale formation keeping.
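The abstract names three design ingredients without giving their formulas. The sketch below is one plausible reading, not the authors' implementation: the policy output is treated as a bounded correction added to a nominal command, and the reward combines a distance term, a distance-rate term, and a penalty on changes in the action. The function names, the weights `w_dist`, `w_rate`, `w_smooth`, and the bound `max_delta` are all hypothetical.

```python
import numpy as np


def apply_deviation_action(nominal_cmd, action, max_delta):
    """Deviation-form action (assumed form): the policy output in [-1, 1]
    is scaled and added to the nominal command instead of replacing it."""
    action = np.clip(np.asarray(action, dtype=float), -1.0, 1.0)
    return np.asarray(nominal_cmd, dtype=float) + max_delta * action


def tracking_reward(dist, prev_dist, action, prev_action, dt,
                    w_dist=1.0, w_rate=0.5, w_smooth=0.1):
    """Hypothetical reward: penalize distance to the reference trajectory,
    penalize a growing distance (reward a shrinking one), and penalize
    step-to-step changes in the action to limit control fluctuation."""
    action = np.asarray(action, dtype=float)
    prev_action = np.asarray(prev_action, dtype=float)
    dist_rate = (dist - prev_dist) / dt          # negative when closing in
    smooth_penalty = float(np.sum((action - prev_action) ** 2))
    return -w_dist * dist - w_rate * dist_rate - w_smooth * smooth_penalty


if __name__ == "__main__":
    # Nominal command with a clipped correction applied on top of it.
    cmd = apply_deviation_action([0.10, 0.00], [2.0, -0.5], max_delta=0.05)
    print(cmd)

    # Same tracking error, but the second call changes the action abruptly.
    r_smooth = tracking_reward(10.0, 12.0, [0.0, 0.0], [0.0, 0.0], dt=1.0)
    r_jumpy = tracking_reward(10.0, 12.0, [1.0, 0.0], [0.0, 0.0], dt=1.0)
    print(r_smooth, r_jumpy)
```

Bounding the correction keeps early, untrained policies close to the nominal command, which is one common reason such a parameterization eases convergence. Whether the paper uses this exact mechanism is not stated in the abstract.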

Keywords: TD3 algorithm; standard trajectory guidance; reinforcement learning guidance; formation keeping; Monte Carlo simulation

Classification number: V448 [Aerospace Science and Technology: Flight Vehicle Design]

 
