Framework of wargaming decision-making methods based on deep reinforcement learning (基于深度强化学习的兵棋推演决策方法框架)  Cited by: 15


Authors: 崔文华 (CUI Wenhua), 李东 (LI Dong), 唐宇波 (TANG Yubo)[1], 柳少军 (LIU Shaojun)[1] (National Defense University, Beijing 100091, China)

Affiliation: [1] National Defense University, Beijing 100091, China

Source: National Defense Technology (《国防科技》), 2020, No. 2, pp. 113-121 (9 pages)

Abstract: To address automated adversarial play in wargaming, this paper proposes building adversarial strategies from deep learning networks and reinforcement learning models. Drawing on the strengths of deep reinforcement learning and on a multi-source, hierarchical description of the battlefield situation, it proposes a battlefield situation representation method oriented toward intelligent gaming. It then combines the principle of layered, domain-partitioned operational command with the modular and layered architectures used in real-time strategy games to propose a hierarchical and modular deep reinforcement learning framework, which covers both the mechanism by which each decision-making agent interacts with the battlefield environment and the generation of adversarial strategies. To meet the highly real-time response requirements of actual operations, a compressed deep reinforcement learning method is proposed to speed up model output. To improve adaptability to different environments, deep transfer learning is proposed to strengthen the model's generalization ability.

Keywords: wargaming; deep reinforcement learning; situation representation; compressed learning method; deep transfer learning

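Classification codes: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]; E91 [Automation and Computer Technology: Control Science and Engineering]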

 
