supported by Open Fund/Postdoctoral Fund of the Laboratory of Cognition and Decision Intelligence for Complex Systems,Institute of Automation,Chinese Academy of Sciences,China(No.CASIA-KFKTXDA27040809).
With the breakthrough of AlphaGo,deep reinforcement learning has become a recognized technique for solving sequential decision-making problems.Despite its reputation,data inefficiency caused by its trial and error lea...
In this paper,the pursuit-evasion game with state and control constraints is solved to achieve the Nash equilibrium of both the pursuer and the evader with an iterative self-play technique.Under the condition where th...
National Natural Science Foundation of China(No.61906197).
With the breakthrough of AlphaGo,human-computer gaming AI has ushered in a big explosion,attracting more and more researchers all over the world.As a recognized standard for testing artificial intelligence,various hum...
National Natural Science Foundation of China,Grant/Award Number:62003267;Fundamental Research Funds for the Central Universities,Grant/Award Number:G2022KY0602;Technology on Electromagnetic Space Operations and Applications Laboratory,Grant/Award Number:2022ZX0090;Key Core Technology Research Plan of Xi'an,Grant/Award Number:21RGZN0016。
Aiming at addressing the problem of manoeuvring decision-making in UAV air combat,this study establishes a one-to-one air combat model,defines missile attack areas,and uses the non-deterministic policy Soft-Actor-Crit...
National Key Research and Development Program of China(2017YFB1002503);Science and Technology Innovation 2030-“New Generation Artificial Intelligence”Major Project(2018AAA0100902),China.
Solving the optimization problem to approach a Nash Equilibrium point plays an important role in imperfect information games,e.g.,StarCraft and poker.Neural Fictitious Self-Play(NFSP)is an effective algorithm that lea...
A promising approach to learn to play board games is to use reinforcement learning algorithms that can learn a game position evaluation function. In this paper we examine and compare three different methods for genera...