Authors: CHEN Zhuo (陈灼), ZHOU Yihua (周翼骅), PU Junsong (普俊松), ZHANG Xiushan (张秀山) [1]; ZHANG Dian (张典) [2]
Affiliations: [1] School of Electronic Engineering, Naval University of Engineering, Wuhan, Hubei 430033, China; [2] School of Cyber Science and Engineering, Wuhan University, Wuhan, Hubei 430072, China
Source: Software Guide (《软件导刊》), 2025, No. 1, pp. 15-20 (6 pages)
Funding: Natural Science Foundation of Hubei Province (2022CFB012).
Abstract: Traditional reinforcement learning methods generalize poorly and often perform badly when applied directly to specific tasks, especially in adversarial two-sided game scenarios, where the situation is more complex. To address this problem, the paper proposes a reinforcement-learning-based method for training two-sided game strategies and, building on it, a method for training two-sided group game strategies that combines reinforcement learning with rule-library enhancement. Experiments show that the method significantly improves the agents' behavioral decision-making, with the total reward obtained by the agent approaching 14.5. In a simulated capture task, the agents' behavioral decisions were effectively optimized and improved. In addition, configuring different rule libraries introduces uncertainty into the simulated environment, which better reflects the complexity of real-world environments.
Classification code: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]