Research on No Go game system based on AlphaZero

Authors: GAO Tongtong, DING Jiahui, SHU Wen'ao, YIN Siqi (Computer School, Beijing Information Science and Technology University, Beijing 100101, China)

Affiliation: [1] Computer School, Beijing Information Science and Technology University, Beijing 100101, China

Source: Intelligent Computer and Applications, 2022, No. 11, pp. 138-141, 147 (5 pages)

Abstract: In 2017, Google's DeepMind team announced AlphaZero, an important milestone in artificial intelligence research. The algorithm is trained through self-play without requiring expert data and is applicable to a variety of board games. Taking No Go as the test bed, this paper applies the AlphaZero algorithm to a No Go game system and analyzes in some detail the implementation of Monte Carlo tree search guided by a policy network and a value network. By learning game knowledge through self-play, the system reinforces itself and optimizes its evaluation function.
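The search procedure named in the abstract can be illustrated with a minimal sketch of the selection and backup steps of an AlphaZero-style Monte Carlo tree search. This is not the authors' implementation: the `Node` class, its field names, the `select_child` and `backup` helpers, and the exploration constant `c_puct = 1.5` are all assumptions made for illustration. The selection rule is the standard PUCT form, which picks the move maximising Q + U, where U grows with the policy-network prior and shrinks with the visit count.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """One search-tree node (hypothetical structure, not from the paper)."""
    prior: float                 # P(s, a): move probability from the policy network
    visits: int = 0              # N(s, a): number of simulations through this node
    value_sum: float = 0.0       # W(s, a): accumulated value-network evaluations
    children: dict = field(default_factory=dict)  # move -> Node

    def q(self) -> float:
        """Mean value Q(s, a); zero for an unvisited node."""
        return self.value_sum / self.visits if self.visits else 0.0


def select_child(node: Node, c_puct: float = 1.5):
    """Return the (move, child) pair that maximises Q + U (PUCT rule).

    c_puct is an assumed exploration constant, not a value from the paper.
    """
    total = sum(child.visits for child in node.children.values())

    def score(item):
        _, child = item
        u = c_puct * child.prior * math.sqrt(total) / (1 + child.visits)
        return child.q() + u

    return max(node.children.items(), key=score)


def backup(path, value):
    """Propagate a leaf evaluation up the search path.

    `value` is taken from the perspective of the player who moved into the
    last node of `path`; the sign flips at every ply because the two
    players alternate.
    """
    for node in reversed(path):
        node.visits += 1
        node.value_sum += value
        value = -value
```

In a full system, the priors and the leaf value would come from the policy and value networks, and legal moves would be generated by the No Go rules; both are outside the scope of this sketch.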

Keywords: computer game playing; No Go; self-play; neural network; Monte Carlo; AlphaZero; policy network; value network; loss function

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
