Authors: ZHAO Lin; LYU Ke; GUO Jing; HONG Chen; XIANG Xiancai; XUE Jian; WANG Yong
Affiliations: [1] School of Engineering Science, University of Chinese Academy of Sciences, Beijing 100049, China; [2] College of Electronic and Information Engineering, Shenyang Aerospace University, Shenyang, Liaoning 110136, China; [3] College of Robotics, Beijing Union University, Beijing 100101, China; [4] School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China
Source: Journal of Computer Applications (《计算机应用》), 2023, No. 11, pp. 3641-3646 (6 pages)
Fund: National Key Research and Development Program of China (2018AAA0100804).
Abstract: When an Unmanned Aerial Vehicle (UAV) cluster attacks ground targets, it is divided into two formations: a strike UAV cluster that attacks the main targets and an auxiliary UAV cluster that pins down the enemy. When the auxiliary UAVs choose between the two action strategies of aggressive attack and saving strength, the mission scenario resembles a public goods game, in which the benefit to a cooperator is less than that to a defector. Based on this, a cooperative combat decision method for UAV clusters based on deep reinforcement learning was proposed. First, a public-goods-game-based UAV cluster combat model was built to simulate the conflict of interest between individuals and the group in the cooperation of intelligent UAV clusters. Then, the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm was used to solve for the most reasonable combat decision of the auxiliary UAV cluster, so as to achieve cluster victory at the minimum loss cost. Training and experiments were carried out with different numbers of UAVs. The results show that, compared with the training effects of the IDQN (Independent Deep Q-Network) and ID3QN (Imitative Dueling Double Deep Q-Network) algorithms, the proposed algorithm has the best convergence; its winning rate reaches 100% with four auxiliary UAVs, and it also significantly outperforms the comparison algorithms with other numbers of UAVs.
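For readers unfamiliar with the public goods game formulation referenced in the abstract, the following minimal Python sketch illustrates why a strength-saving (defecting) UAV earns more in a single round than an aggressively attacking (cooperating) UAV. The payoff function, the multiplication factor r, and the attack cost are generic textbook assumptions for illustration only; they are not the exact combat model or reward design used in the paper.

```python
import numpy as np

def public_goods_payoff(actions, r=1.6, cost=1.0):
    """One-shot public goods payoff for an auxiliary UAV cluster (illustrative).

    actions : 0/1 array, 1 = aggressive attack (cooperate, contribute),
              0 = save strength (defect, free-ride).
    r       : multiplication factor on the pooled contribution
              (assumed 1 < r < number of UAVs, so attacking is individually costly).
    cost    : loss a UAV incurs when it chooses to attack aggressively (assumed).
    """
    actions = np.asarray(actions, dtype=float)
    n = actions.size
    shared_benefit = r * cost * actions.sum() / n   # public benefit split equally among all UAVs
    return shared_benefit - cost * actions          # only cooperators pay the attack cost

# With 4 auxiliary UAVs and one free-rider, the defector's payoff (1.2)
# exceeds each cooperator's payoff (0.2), reproducing the dilemma described above.
print(public_goods_payoff([1, 1, 1, 0]))
```

The sketch only fixes the per-round payoff structure of the dilemma; in the paper, MADDPG is reported to learn the auxiliary cluster's decisions within such a setting so that the cluster wins at minimum loss cost.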
Keywords: unmanned aerial vehicle (UAV); multi-cluster; public goods game; multi-agent deep deterministic policy gradient (MADDPG); cooperative combat decision method
CLC number: V279.2 [Aerospace Science and Technology - Aircraft Design]