Authors: NI Weilin (倪炜霖); LIU Jiaqi (刘佳琪) [2]; SHAO Jie (邵节); LIU Peng (刘鹏); LIANG Haizhao (梁海朝)
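Affiliations: [1] School of Aeronautics and Astronautics, Sun Yat-sen University, Shenzhen 518000; [2] Beijing Institute of Space Long March Vehicle, Beijing 100076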
Source: Missiles and Space Vehicles (《导弹与航天运载技术(中英文)》), 2025, No. 1, pp. 65-72 (8 pages)
Funding: National Natural Science Foundation of China (No. 62003375).
Abstract: To address the active anti-interception game confrontation in which a hypersonic aircraft and an accompanying defense aircraft cooperate to evade interceptor attacks, an active defense intelligent guidance method for hypersonic aircraft is proposed based on a deep reinforcement learning algorithm. The method achieves a relatively high game success rate even when the target aircraft has insufficient maneuverability. To address the sparse reward problem in reinforcement learning training, a reward function shaping method is proposed, which improves the convergence efficiency and training stability of the reinforcement learning algorithm. Finally, the effectiveness of the proposed method is verified by numerical simulation. The simulation results show that the proposed method succeeds in the flight vehicle game confrontation and achieves a higher game success rate than traditional game guidance methods.
Keywords: game confrontation; deep reinforcement learning; reward function shaping; sparse reward; active anti-interception
Classification: V11 [Aerospace Science and Technology - Human-Machine and Environmental Engineering]