Authors: LIU Jing, HUA Xiang [1,2], ZHANG Jinjin [1]
Affiliations: [1] School of Defence Science and Technology, Xi'an Technological University, Xi'an 710021, China; [2] School of Electronic Information Engineering, Xi'an Technological University, Xi'an 710021, China
Source: Journal of Xi'an Technological University, 2023, No. 3, pp. 277-286 (10 pages)
Funding: Key Research and Development Program of Shaanxi Province (2023 YBGY 227); Xi'an Science and Technology Plan Project (2022JH RYFW 0138).
Abstract: To address cooperative hunting of a single intelligent target by a UAV swarm, the paper proposes a cooperative hunting method based on improved game learning. From the kinematic relationship between the swarm and the target, a cooperative hunting model is established that combines game theory with the Apollonius circle. Based on the cooperative relationship among swarm members and the game relationship between the pursuing and evading sides, the greedy factor is adjusted dynamically using the Q-Learning algorithm and the mean of the learned rewards, in order to build and refine the state-action matrix. The Nash equilibrium of the payoff matrix is then solved from the state-action matrix to complete the cooperative hunting of the single target by the swarm. The simulation results show that, compared with the conventional Q-Learning algorithm, the average reward obtained by each hunting UAV with this method is higher by 48%, 32.4%, and 50.8%, respectively, and the average number of steps required to complete the hunting task is reduced by 58.7%. The method can therefore hunt a single target effectively and in less time.
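The abstract names the Apollonius circle as part of the hunting model but does not give its construction. As a minimal sketch of the standard geometry only (not the paper's model), the Python below computes the circle of points that a pursuer and an evader reach at the same time when both move in straight lines at constant speed. The function name `apollonius_circle` and the parameter `speed_ratio` (pursuer speed divided by evader speed) are assumptions introduced for illustration.

```python
import math


def apollonius_circle(pursuer, evader, speed_ratio):
    """Return (center, radius) of the Apollonius circle.

    The circle is the set of points X with
    |X - pursuer| / |X - evader| == speed_ratio,
    i.e. the points both agents reach simultaneously.
    speed_ratio is an illustrative parameter: v_pursuer / v_evader.
    """
    if speed_ratio <= 0:
        raise ValueError("speed_ratio must be positive")
    k2 = speed_ratio ** 2
    if math.isclose(k2, 1.0):
        # Equal speeds: the locus is the perpendicular bisector, not a circle.
        raise ValueError("speed_ratio == 1 gives a line, not a circle")

    px, py = pursuer
    ex, ey = evader
    # Center: (P - k^2 * E) / (1 - k^2)
    cx = (px - k2 * ex) / (1.0 - k2)
    cy = (py - k2 * ey) / (1.0 - k2)
    # Radius: k * |P - E| / |1 - k^2|
    radius = speed_ratio * math.hypot(px - ex, py - ey) / abs(1.0 - k2)
    return (cx, cy), radius


center, radius = apollonius_circle((0.0, 0.0), (3.0, 0.0), 2.0)
print(center, radius)
```

With a pursuer at (0, 0), an evader at (3, 0), and a pursuer twice as fast, the circle is centered at (4, 0) with radius 2 and encloses the evader: inside it the evader arrives first, outside it the pursuer does.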
Keywords: UAV swarm; cooperative hunting; game theory; Apollonius circle; Q-Learning
Classification number: TP242 [Automation and Computer Technology: Detection Technology and Automatic Devices]