Authors: GUO Hong-yu (郭洪宇); CHU Yang (初阳); LIU Zhi (刘志); ZHOU Yu-fang (周玉芳) (Jiangsu Automation Research Institute, Lianyungang 222061, China)
Source: Command Control & Simulation (《指挥控制与仿真》), 2022, No. 1, pp. 103-111 (9 pages)
Abstract: The offensive and defensive confrontation between a submarine and a surface ship formation is a key topic in submarine combat research. Ensuring that the submarine survives and breaks through a joint blockade by the ship formation, anti-submarine helicopters, and other forces is a test of submarine command decision-making. To this end, for the submarine-ship-helicopter game confrontation scenario, a submarine agent is constructed from two aspects, deep reinforcement learning and rule-based reasoning, and two improvement mechanisms for the Proximal Policy Optimization (PPO) algorithm are proposed. Mutual game confrontation and distributed training are carried out, finally realizing intelligent decision-making by the submarine during the confrontation. The technical route and algorithms are implemented and verified on a wargaming platform; the improved algorithm shows a marked gain in convergence speed and stability, providing a technical reference for research on intelligent submarine command decision-making.
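For context, the sketch below shows the standard PPO clipped surrogate objective that the paper builds on. It is a generic PyTorch illustration of textbook PPO only; the two improvement mechanisms proposed by the authors are not detailed in this abstract, so they are not reproduced here.

```python
# Minimal sketch of the standard PPO clipped surrogate loss
# (Schulman et al., 2017). Illustrative only; not the authors'
# improved variants, which are not specified in the abstract.
import torch

def ppo_clip_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Clipped policy loss: -E[min(r * A, clip(r, 1-eps, 1+eps) * A)]."""
    ratio = torch.exp(new_log_probs - old_log_probs)  # pi_new / pi_old
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # Negative because optimizers minimize; PPO maximizes the surrogate.
    return -torch.min(unclipped, clipped).mean()

if __name__ == "__main__":
    # Toy usage with random tensors standing in for a batch of transitions.
    n = 64
    old_lp = torch.randn(n)
    new_lp = old_lp + 0.1 * torch.randn(n)
    adv = torch.randn(n)
    print(ppo_clip_loss(new_lp, old_lp, adv).item())
```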