Authors: ZHAO Zhixin, CHEN Jie, XIN Bin, LI Li, JIAO Keming, ZHENG Yifan
Affiliations: [1] School of Electronics and Information Engineering, Tongji University, Shanghai 201804, China; [2] Shanghai Institute of Intelligent Science and Technology, Tongji University, Shanghai 201804, China; [3] School of Automation, Beijing Institute of Technology, Beijing 100081, China; [4] National Key Laboratory of Autonomous Intelligent Unmanned Systems, Beijing 100081, China; [5] Shanghai Institute of Intelligent Science and Technology, Tongji University, Shanghai 201804, China
Source: Journal of Systems Science & Complexity (系统科学与复杂性学报(英文版)), 2024, No. 1, pp. 369-388 (20 pages)
Funding: Supported in part by the National Natural Science Foundation of China Basic Science Research Center Program under Grant No. 62088101; the National Natural Science Foundation of China under Grant Nos. 7217117 and 92367101; the Aeronautical Science Foundation of China under Grant No. 2023Z066038001; the Shanghai Municipal Science and Technology Major Project under Grant No. 2021SHZDZX0100; and the Chinese Academy of Engineering, Strategic Research and Consulting Program under Grant No. 2023-XZ-65.
Abstract: The multi-UAV adversary swarm defense (MUASD) problem is to defend a static base against an adversary UAV swarm using a defensive UAV swarm. Decomposing the problem into task assignment and low-level interception strategies is a widely used approach, and learning-based approaches to task assignment are a promising direction. Existing studies on learning-based methods generally assume a decentralized decision-making architecture, which does not help with conflict resolution. In contrast, a centralized decision-making architecture helps with conflict resolution but is often detrimental to scalability. To achieve scalability and conflict resolution simultaneously, and inspired by a self-attention-based task assignment method for the sensor target coverage problem, a scalable centralized assignment method based on a self-attention mechanism together with defender-attacker pairwise observation preprocessing (DAP-SelfAtt) is proposed. Then, an imperative-priori conflict resolution (IPCR) mechanism is proposed to achieve conflict-free assignment. Further, the IPCR mechanism is parallelized to enable efficient training. To validate the algorithm, a variant of the proximal policy optimization (PPO) algorithm is employed for training in scenarios of various scales. The experimental results show that the proposed algorithm not only achieves conflict-free task assignment but also maintains scalability, and significantly improves the defense success rate.
Keywords: conflict resolution; reinforcement learning; scalability; task assignment
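Classification: V279 [Aeronautical and Astronautical Science and Technology: Aircraft Design]; TP18 [Automation and Computer Technology: Control Theory and Control Engineering]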