Authors: LIU Jiayi (刘家义); WANG Gang (王刚)[2]; JIA Chenxing (贾晨星); FU Qiang (付强)[2]; MING Yuewei (明月伟)
Affiliations: [1] Joint Operations College, National Defense University, Shijiazhuang 050084, China; [2] Air Defense and Antimissile School, Air Force Engineering University, Xi'an 710051, China
Source: Journal of Air Force Engineering University (《空军工程大学学报》), 2025, No. 1, pp. 104-110 (7 pages)
Funding: National Natural Science Foundation of China (62106283)
Abstract: In modern information warfare, the battlefield environment is complex and volatile, characterized by high dynamics, incomplete information, and uncertainty; deep reinforcement learning (DRL) offers a new approach to the task assignment problem in such settings. To address the insufficient generalization ability of agents in uncertain scenarios, a multi-decision-style agent architecture for uncertain scenarios is proposed, strengthening the agent's adaptability to uncertain, complex environments. To address the difficulty that a single reward function in DRL rarely trains agents whose behavior matches human decision logic, an event-based reward mechanism is proposed to guide the agent's learning in a well-founded way. Finally, the feasibility and superiority of the proposed method are verified in a digital battlefield simulation environment.
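The abstract names the two techniques but gives no concrete definitions. As a rough illustration only, the following minimal Python sketch shows one plausible shape of an event-based reward mechanism combined with per-style weighting in the spirit of a multi-decision-style agent. All event names, weights, and the style parameter are hypothetical assumptions, not taken from the paper.

# Minimal sketch of an event-based reward with per-decision-style weights.
# Rather than a single scalar objective, discrete battlefield events each
# contribute a weighted term; the weight profile differs per decision style
# (e.g. an aggressive vs. a conservative agent). All names and numbers here
# are illustrative assumptions.

EVENT_WEIGHTS = {
    "aggressive":   {"target_destroyed": 10.0, "asset_lost": -5.0,
                     "intercept_success": 3.0, "step_cost": -0.01},
    "conservative": {"target_destroyed": 5.0,  "asset_lost": -10.0,
                     "intercept_success": 3.0, "step_cost": -0.01},
}

def event_based_reward(events: dict, style: str = "aggressive") -> float:
    """Sum the weighted contributions of the events fired this time step.

    `events` maps event names to occurrence counts,
    e.g. {"target_destroyed": 1, "step_cost": 1}.
    """
    weights = EVENT_WEIGHTS[style]
    return float(sum(weights.get(name, 0.0) * count
                     for name, count in events.items()))

# Example: one step in which an aggressive-style agent destroys a target
# but loses an asset.
r = event_based_reward({"target_destroyed": 1, "asset_lost": 1,
                        "step_cost": 1}, style="aggressive")
print(r)  # 10.0 - 5.0 - 0.01 = 4.99

Under this reading, training the same policy network against differently weighted reward profiles would yield the multiple decision styles the abstract refers to; the paper itself may realize the architecture quite differently.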
CLC number: TP391.9 (Automation and Computer Technology / Computer Application Technology)