Authors: CUI Wenhua; LI Dong; TANG Yubo[1]; LIU Shaojun[1] (National Defense University, Beijing 100091, China)
Affiliation: [1] National Defense University, Beijing 100091, China
Source: National Defense Technology, 2020, No. 2, pp. 113-121 (9 pages)
Abstract: To address the problem of automated confrontation in wargaming, this paper proposes constructing countering strategies from a deep learning network and a reinforcement learning model. Drawing on the strengths of deep reinforcement learning and a multi-source, hierarchical description of the battlefield, the paper first proposes a battlefield situation representation method for intelligent gaming. It then combines the hierarchical, domain-partitioned principle of operational command with the modular and layered architectures used in real-time strategy games to propose a hierarchical and modular deep reinforcement learning framework, which governs how each decision agent interacts with the battlefield environment and how countering strategies are produced. To meet the high real-time demands of operational response, a compressed deep reinforcement learning method is proposed to speed up model output, and to improve adaptability to different environments, deep transfer learning is used to strengthen the model's generalization ability.
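The paper itself does not publish code; as a rough illustration of the general pattern the abstract describes (a high-level controller delegating to per-domain decision modules, mirroring the hierarchical, domain-partitioned command principle), here is a minimal tabular epsilon-greedy sketch. All names here (`SubPolicy`, `HierarchicalAgent`, the "maneuver"/"fires" domains, the toy reward) are hypothetical assumptions, not the authors' framework, which uses deep networks rather than tables.

```python
import random

random.seed(0)

class SubPolicy:
    """Low-level domain module: epsilon-greedy over a small discrete action set."""
    def __init__(self, actions):
        self.actions = list(actions)
        self.q = {a: 0.0 for a in self.actions}  # running value estimate per action
    def act(self, epsilon=0.1):
        if random.random() < epsilon:
            return random.choice(self.actions)
        return max(self.q, key=self.q.get)
    def update(self, action, reward, lr=0.1):
        self.q[action] += lr * (reward - self.q[action])

class HierarchicalAgent:
    """High-level controller: picks a domain module, which then picks the action."""
    def __init__(self, modules):
        self.modules = dict(modules)
        self.meta_q = {name: 0.0 for name in self.modules}  # value of each domain
    def act(self, epsilon=0.1):
        if random.random() < epsilon:
            domain = random.choice(list(self.modules))
        else:
            domain = max(self.meta_q, key=self.meta_q.get)
        return domain, self.modules[domain].act(epsilon)
    def update(self, domain, action, reward, lr=0.1):
        # Both levels learn from the same scalar reward signal.
        self.meta_q[domain] += lr * (reward - self.meta_q[domain])
        self.modules[domain].update(action, reward, lr)

agent = HierarchicalAgent({
    "maneuver": SubPolicy(["advance", "hold"]),
    "fires": SubPolicy(["strike", "suppress"]),
})

# Toy stand-in for the battlefield environment: only delegating to the
# "fires" domain and choosing "strike" yields a reward.
for _ in range(500):
    domain, action = agent.act(epsilon=0.2)
    reward = 1.0 if (domain, action) == ("fires", "strike") else 0.0
    agent.update(domain, action, reward)

print(max(agent.meta_q, key=agent.meta_q.get))  # domain the controller now prefers
```

The two-level value update is the design point: the controller learns *which domain* to delegate to while each module learns *what to do* inside its domain, which is the modularity the paper exploits to keep each policy small.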
Keywords: wargaming; deep reinforcement learning; situation representation; compressed learning method; deep transfer learning
Classification: TP18 [Automation and Computer Technology—Control Theory and Control Engineering]; E91 [Automation and Computer Technology—Control Science and Engineering]