基于有效实例的改进U树算法  

A modified U-tree algorithm based on effective instances

在线阅读下载全文

作  者:宋佳佳[1] 王作为 SONG Jia-jia;WANG Zuo-wei(School of Computer Science and Software Engineering,Tianjin Polytechnic University,Tianjin 300387,China)

机构地区:[1]天津工业大学计算机与软件学院,天津300387

出  处:《计算机工程与科学》2019年第1期185-190,共6页Computer Engineering & Science

摘  要:传统U-Tree算法对于部分观测马尔可夫决策过程POMDP问题的解决已取得较为显著的成效,但是由于边缘节点生长过于随意,所以仍存在树的规模庞大、内存需求比较大、计算复杂度过高的问题。在原U-Tree算法的基础上,通过得到下一步观测值,来划分同一个叶子节点中做相同动作的实例,提出了一种基于有效实例来扩展边缘节点的EIU-Tree算法,大大缩减了计算规模,以此来帮助智能体更好更快地学习,并且在4×3经典栅格问题中做了仿真实验,对比于原有的U-Tree算法,该算法运行效果更好。The traditional U-tree algorithm has achieved remarkable results in solving the problem of partially observable Markov decision process (POMDP), however, because of excessive random growth of fringe nodes, some problems such as large scale trees, large memory requirement and high computational complexity, still remain. Based on the original U-Tree algorithm, we classify the instances of the same leaf node which do the same action after obtaining the observation value, and propose an effective instance U-tree algorithm which extends fringe nodes based on effective instances. It greatly reduces computational scale to help the agent to learn faster and better. Simulation experiments are carried out on the classic 4×3 grid problem, and experimental results show that the algorithm outperforms the original u-Tree algorithm.

关 键 词:部分观测马尔可夫决策过程 强化学习 U-树 Q-学习算法 

分 类 号:TP393[自动化与计算机技术—计算机应用技术]

 

参考文献:

正在载入数据...

 

二级参考文献:

正在载入数据...

 

耦合文献:

正在载入数据...

 

引证文献:

正在载入数据...

 

二级引证文献:

正在载入数据...

 

同被引文献:

正在载入数据...

 

相关期刊文献:

正在载入数据...

相关的主题
相关的作者对象
相关的机构对象