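Affiliation: [1] School of Network Security Protection, People's Public Security University of China, Beijing 100038
Source: Transactions of Beijing Institute of Technology (《北京理工大学学报》), 2017, No. 1, pp. 93-99 (7 pages)
Funding: Basic Scientific Research Funds Project of People's Public Security University of China (2014JKF01132)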
Abstract: Existing patrol path planning algorithms handle only two-player games and ignore the presence of attackers. To address this, a new multi-agent reinforcement learning algorithm is proposed. Given the distribution of attack targets, it plans the optimal patrol path for an arbitrary number of defenders and attackers. Because defenders and attackers do not choose their strategies simultaneously, the strong Stackelberg equilibrium is used as the basis for each agent's strategy selection. The algorithm was tested on several patrol missions, and the quantitative and qualitative results demonstrate its convergence and effectiveness.
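Keywords: patrol path planning; strong Stackelberg equilibrium; multi-agent; reinforcement learning
Classification: TP399 [Automation and Computer Technology: Computer Application Technology]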
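The abstract names the strong Stackelberg equilibrium as each agent's rule for selecting a strategy but gives no further detail. As a rough illustration only, the sketch below computes a pure-strategy strong Stackelberg equilibrium for a single two-player matrix game. It assumes the defender is the leader who commits first and the attacker is the follower who best-responds, with ties broken in the leader's favor. The function name, the payoff matrices, and the restriction to pure strategies are all hypothetical and are not taken from the paper.

```python
from typing import Optional, Sequence, Tuple


def strong_stackelberg_pure(
    leader_payoff: Sequence[Sequence[float]],
    follower_payoff: Sequence[Sequence[float]],
) -> Optional[Tuple[int, int, float]]:
    """Return (leader_action, follower_action, leader_value), or None if empty.

    Hypothetical helper: rows index the leader's actions and columns index
    the follower's actions. Only pure strategies are considered.
    """
    best: Optional[Tuple[int, int, float]] = None
    for i, (l_row, f_row) in enumerate(zip(leader_payoff, follower_payoff)):
        # The follower observes the leader's commitment i and best-responds.
        top = max(f_row)
        responses = [j for j, value in enumerate(f_row) if value == top]
        # "Strong" equilibrium: follower ties are broken in the leader's favor.
        j = max(responses, key=lambda k: l_row[k])
        # The leader keeps the commitment with the highest resulting payoff.
        if best is None or l_row[j] > best[2]:
            best = (i, j, l_row[j])
    return best


# Illustrative payoffs only (assumed, not from the paper).
defender = [[2, 4], [1, 3]]
attacker = [[1, 0], [0, 1]]
result = strong_stackelberg_pure(defender, attacker)
```

In a reinforcement learning setting, the payoff matrices would presumably be replaced by each agent's learned value estimates for the current state. That is an assumption about how such a rule is typically applied, not a description of the paper's algorithm.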