Police Patrol Path Planning Using Stackelberg Equilibrium Based Multiagent Reinforcement Learning

Cited by: 4

Source: Transactions of Beijing Institute of Technology (《北京理工大学学报》), 2017, No. 1, pp. 93-99 (7 pages)

Funding: Basic Scientific Research Fund Project of the People's Public Security University of China (2014JKF01132)

Abstract: Existing patrol path planning algorithms can only handle two-player games in a grid world and ignore the presence of attackers. To address this limitation, a novel multi-agent reinforcement learning algorithm is proposed. Given the distribution of attack targets, the algorithm plans optimal patrol paths for an arbitrary number of defenders and attackers. Because defenders and attackers do not choose their strategies simultaneously, the strong Stackelberg equilibrium is adopted as the basis on which each agent selects its strategy. To verify the algorithm, it was tested on several patrol missions. The quantitative and qualitative experimental results demonstrate the convergence and effectiveness of the algorithm.

Keywords: patrol route planning; strong Stackelberg equilibrium; multi-agent; reinforcement learning

Classification: TP399 [Automation and Computer Technology: Computer Application Technology]
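
The abstract states that each agent selects its strategy according to a strong Stackelberg equilibrium, in which the leader (here, a defender) commits first and the follower (an attacker) best-responds, with ties broken in the leader's favor. The following is a minimal sketch of that selection rule for a single stage game, restricted to pure strategies. It is not the paper's algorithm: the function name `strong_stackelberg_pure` and the payoff values are hypothetical and chosen only for illustration, and the payoff matrices are assumed to be non-empty.

```python
from typing import List, Tuple


def strong_stackelberg_pure(
    leader_payoff: List[List[float]],
    follower_payoff: List[List[float]],
) -> Tuple[int, int]:
    """Pure-strategy strong Stackelberg selection for a two-player matrix game.

    leader_payoff[a][b] and follower_payoff[a][b] are the payoffs when the
    leader plays action a and the follower plays action b.
    Assumes both matrices are non-empty and have the same shape.
    """
    best_pair = (0, 0)
    best_value = float("-inf")
    for a, (l_row, f_row) in enumerate(zip(leader_payoff, follower_payoff)):
        # The follower observes the leader's commitment and best-responds.
        top = max(f_row)
        responses = [b for b, v in enumerate(f_row) if v == top]
        # "Strong" equilibrium: follower ties are broken in the leader's favor.
        b_star = max(responses, key=lambda b: l_row[b])
        # The leader commits to the action with the best anticipated outcome.
        if l_row[b_star] > best_value:
            best_value = l_row[b_star]
            best_pair = (a, b_star)
    return best_pair


# Hypothetical payoffs: the defender patrols target 0 or 1,
# the attacker attacks target 0 or 1.
defender_q = [[2.0, -3.0], [-1.0, 1.0]]
attacker_q = [[-2.0, 3.0], [1.0, 1.0]]
print(strong_stackelberg_pure(defender_q, attacker_q))
```

In a multi-agent reinforcement learning setting, a rule of this kind would typically be applied to the agents' learned value estimates at each state rather than to fixed payoffs; how the paper does this is not described in the abstract.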
