Authors: CHEN Miao-yun; WANG Lei; DING Zhi-qiang
Affiliation: School of Information Science and Technology, University of Science and Technology of China, Hefei, Anhui 230031, China
Source: Computer Simulation, 2021, No. 2, pp. 301-307 (7 pages)
Funding: Chinese Academy of Sciences Innovation Fund (High-Tech Project CXJJ-17-M139); Chinese Academy of Sciences Major Special Project (KGFZD-135-18-027).
Abstract: Using behavior trees for decision-making in multi-agent simulation is intuitive and easily extensible, but the design process of a behavior tree is complex and manual debugging is inefficient. This paper introduces Q-Learning to realize automatic design of behavior trees. To address the slow convergence of traditional Q-Learning, the Metropolis criterion from simulated annealing is applied to the action selection strategy, adaptively reducing the selection probability of suboptimal actions as learning proceeds, and a dynamic programming idea is applied to the Q-value update strategy, updating Q values in reverse order over each episode. Experimental results show that the agent decision model based on the improved multi-step Q-Learning behavior tree converges faster, reduces the use of conditional nodes, and achieves automatic design and optimization of the behavior tree with more reasonable behavior decisions.
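The abstract names two concrete mechanisms: Metropolis-criterion action selection with an annealed temperature, and reverse-order (dynamic-programming style) multi-step Q-value updates. The sketch below illustrates both under assumptions not stated in the record: a tabular Q function over discrete states and actions, and a geometric cooling schedule. All identifiers (metropolis_select, backward_q_update, alpha, gamma, cooling) are illustrative, not the authors' implementation.

```python
import math
import random
from collections import defaultdict

def metropolis_select(q_table, state, actions, temperature):
    """Metropolis-criterion action selection: propose a random action and
    accept it over the greedy one with probability exp(delta_Q / T), so the
    chance of picking a suboptimal action shrinks as T is annealed."""
    greedy = max(actions, key=lambda a: q_table[(state, a)])
    proposal = random.choice(actions)
    delta = q_table[(state, proposal)] - q_table[(state, greedy)]
    if delta >= 0 or random.random() < math.exp(delta / temperature):
        return proposal
    return greedy

def backward_q_update(q_table, trajectory, actions, alpha=0.1, gamma=0.9):
    """Dynamic-programming style multi-step update: sweep a finished episode
    in reverse order, so rewards from later steps have already propagated
    when earlier state-action pairs are refreshed (one backward pass instead
    of waiting many episodes for values to diffuse)."""
    for state, action, reward, next_state in reversed(trajectory):
        best_next = max(q_table[(next_state, a)] for a in actions)
        td_error = reward + gamma * best_next - q_table[(state, action)]
        q_table[(state, action)] += alpha * td_error

# Illustrative learning loop with geometric cooling of the temperature.
q = defaultdict(float)
temperature, cooling = 1.0, 0.95
for episode in range(100):
    trajectory = []  # fill with (state, action, reward, next_state) tuples
    # ... run the agent, choosing actions via
    #     metropolis_select(q, state, actions, temperature)
    backward_q_update(q, trajectory, actions=[0, 1, 2])
    temperature = max(temperature * cooling, 1e-3)
```

Annealing the temperature makes early exploration broad and late behavior nearly greedy, which is the mechanism the abstract credits for reducing suboptimal action selection and speeding convergence.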
Keywords: multi-agent; behavior tree; simulated annealing; dynamic programming; multi-step Q-Learning improved with dynamic programming and simulated annealing
Classification: TP391 (Automation and Computer Technology / Computer Application Technology)