Authors: CHEN Wenxue; GAO Changsheng[1]; JING Wuxing[1] (School of Astronautics, Harbin Institute of Technology, Harbin 150001, China)
Source: Acta Aeronautica et Astronautica Sinica (《航空学报》), 2023, No. 11, pp. 277-295 (19 pages)
Funding: National Natural Science Foundation of China (12072090).
Abstract: Considering the high speed and maneuverability of near-space hypersonic vehicles, this paper proposes a deep reinforcement learning guidance algorithm based on the Trust Region Policy Optimization (TRPO) algorithm to improve the accuracy, robustness, and intelligence of guidance against targets with different initial states and different maneuvering capabilities. The TRPO-based guidance algorithm consists of two policy (action) networks and one critic network, and maps the state of the relative motion system between the near-space target and the interceptor directly, in an end-to-end manner, to the interceptor's guidance command. During training, the continuous action space and the state space are chosen appropriately, and the reward function is constructed by weighing energy consumption, relative distance, and other factors so as to accelerate convergence. Finally, the trained agent model is tested in interception runs across different task scenarios. The simulation results show that, compared with the traditional Proportional Navigation guidance law (PN) and the Improved Proportional Navigation guidance law (IPN), the proposed algorithm achieves smaller miss distances, more stable interception performance, and better robustness in both learned and unknown scenarios, and that it can be deployed on computers with a wide range of configurations.
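For orientation, the PN baseline named in the abstract can be sketched in its textbook planar form, where the commanded lateral acceleration is the navigation gain times the closing speed times the line-of-sight rate. The abstract does not give the PN or IPN formulation actually used in the paper, so the function name `pn_command`, its arguments, and the default gain of 3 are illustrative assumptions only.

```python
import math


def pn_command(rel_pos, rel_vel, nav_gain=3.0):
    """Textbook planar proportional navigation (assumed form, not the paper's).

    rel_pos: (x, y) position of the target relative to the interceptor, in m.
    rel_vel: (vx, vy) velocity of the target relative to the interceptor, in m/s.
    nav_gain: dimensionless navigation constant N (hypothetical default).
    Returns the commanded lateral acceleration N * Vc * lambda_dot in m/s^2.
    """
    rx, ry = rel_pos
    vx, vy = rel_vel
    range_sq = rx * rx + ry * ry
    if range_sq == 0.0:
        raise ValueError("relative range is zero; line of sight is undefined")
    rng = math.sqrt(range_sq)
    # Line-of-sight angle is atan2(ry, rx); its time derivative is the LOS rate.
    los_rate = (rx * vy - ry * vx) / range_sq
    # Closing speed is the negative of the range rate.
    closing_speed = -(rx * vx + ry * vy) / rng
    return nav_gain * closing_speed * los_rate


if __name__ == "__main__":
    # Target 1 km ahead, closing at 500 m/s with a small crossing velocity.
    print(pn_command((1000.0, 0.0), (-500.0, 10.0)))
```

On a pure collision course the line-of-sight rate is zero and the command vanishes, which is the behavior the learned policy in the abstract is compared against.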