Police Patrol Path Planning Using Stackelberg Equilibrium Based Multiagent Reinforcement Learning (基于Stackelberg策略的多Agent强化学习警力巡逻路径规划)  Cited by: 4

Authors: Xie Yi (解易)[1], Gu Yijun (顾益军)[1]

Affiliation: [1] School of Cyber Security Protection (网络安全保卫学院), People's Public Security University of China, Beijing 100038

Source: Transactions of Beijing Institute of Technology (《北京理工大学学报》), 2017, No. 1, pp. 93-99 (7 pages)

Funding: Fundamental Research Funds of the People's Public Security University of China (2014JKF01132)

Abstract: Existing patrol path planning algorithms can only handle two-player games and ignore the existence of attackers. To address these limitations, a novel multi-agent reinforcement learning algorithm is proposed. Given the distribution of attack targets, the algorithm plans the optimal patrol path for an arbitrary number of defenders and attackers. Because defenders and attackers do not choose their strategies simultaneously, the strong Stackelberg equilibrium is adopted as the basis on which each agent selects its strategy. To verify the algorithm, it was tested on several patrol missions. The quantitative and qualitative experimental results demonstrate the convergence and effectiveness of the algorithm.

Keywords: patrol route planning; strong Stackelberg equilibrium; multi-agent; reinforcement learning

Classification code: TP399 [Automation and Computer Technology: Computer Application Technology]
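
Illustrative note: the abstract names the strong Stackelberg equilibrium as the rule by which agents select actions, but this record does not give the algorithm itself. The following is only a minimal sketch of that equilibrium concept for a single stage game with one leader (defender) and one follower (attacker), restricted to pure strategies. The function name `strong_stackelberg_pure` and the payoff tables `leader_q` and `follower_q` are hypothetical and are not taken from the paper; the paper's multi-defender, multi-attacker learning procedure is not reproduced here.

```python
from typing import List, Tuple


def strong_stackelberg_pure(
    leader_q: List[List[float]],
    follower_q: List[List[float]],
) -> Tuple[int, int]:
    """Pure-strategy strong Stackelberg equilibrium of a two-player stage game.

    leader_q[a][b] and follower_q[a][b] are the (assumed) payoffs when the
    leader plays action a and the follower plays action b.
    """
    best_pair = (0, 0)
    best_value = float("-inf")
    for a, follower_row in enumerate(follower_q):
        # The follower observes the leader's committed action a and best-responds.
        top = max(follower_row)
        responses = [b for b, v in enumerate(follower_row) if v == top]
        # "Strong" equilibrium: follower ties are broken in the leader's favor.
        b = max(responses, key=lambda j: leader_q[a][j])
        # The leader commits to the action whose induced response pays it most.
        if leader_q[a][b] > best_value:
            best_value = leader_q[a][b]
            best_pair = (a, b)
    return best_pair


# Hypothetical payoffs: committing to action 1 steers the follower to action 1.
leader_q = [[2.0, 4.0], [1.0, 3.0]]
follower_q = [[1.0, 0.0], [0.0, 1.0]]
print(strong_stackelberg_pure(leader_q, follower_q))  # (1, 1)
```

In this toy game the leader's action 0 would be better against any fixed follower action, yet committing first to action 1 yields a payoff of 3 instead of 2. This is the kind of non-simultaneous advantage that motivates using a Stackelberg rather than a simultaneous-move solution concept.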

 
