检索规则说明:AND代表“并且”;OR代表“或者”;NOT代表“不包含”;(注意必须大写,运算符两边需空一格)
检 索 范 例 :范例一: (K=图书馆学 OR K=情报学) AND A=范并思 范例二:J=计算机应用与软件 AND (U=C++ OR U=Basic) NOT M=Visual
机构地区:[1]中南大学信息科学与工程学院,湖南长沙410083 [2]先进控制与智能自动化湖南省工程实验室,湖南长沙410083 [3]深圳职业技术学院教育技术与信息中心,广东深圳518055 [4]School of Computer Science,Tokyo University of Technology
出 处:《软件学报》2013年第1期25-36,共12页Journal of Software
基 金:国家自然科学基金(61074058;60874042);国家教育部博士点基金(20090162120068)
摘 要:部分可观察马尔可夫决策过程(partially observable Markov decision processes,简称POMDPs)是动态不确定环境下序贯决策的理想模型,但是现有离线算法陷入信念状态"维数灾"和"历史灾"问题,而现有在线算法无法同时满足低误差与高实时性的要求,造成理想的POMDPs模型无法在实际工程中得到应用.对此,提出一种基于点的POMDPs在线值迭代算法(point-based online value iteration,简称PBOVI).该算法在给定的可达信念状态点上进行更新操作,避免对整个信念状态空间单纯体进行求解,加速问题求解;采用分支界限裁剪方法对信念状态与或树进行在线裁剪;提出信念状态结点重用思想,重用上一时刻已求解出的信念状态点,避免重复计算.实验结果表明,该算法具有较低误差率、较快收敛性,满足系统实时性的要求.Partially observable Markov decision processes (POMDPs) provide a rich framework for sequential decision-making in stochastic domains of uncertainty. However, solving POMDPs is typically computationally intractable because the belief states of POMDPs have two curses: Dimensionality and history, and online algorithms that can not simultaneously satisfy the requirement of low errors and high timeliness. In order to address these problems, this paper proposes a point-based online value iteration (PBOVI) algorithm for POMDPs. This algorithm for speeding up POMDPs solving involves performing value backup at specific reachable belief points, rather than over the entire a belief simplex. The paper exploits branch-and-bound pruning approach to prune the AND/OR tree of belief states online and proposes a novel idea to reuse the belief states that have been computed last time to avoid repeated computation. The experiment and simulation results show that the proposed algorithm has its effectiveness in reducing the cost of computing policies and retaining the quality of the policies, so it can meet the requirement of a real-time system.
关 键 词:部分可观察马尔可夫决策过程 信念状态 基于点的算法 在线算法 与或树
分 类 号:TP301[自动化与计算机技术—计算机系统结构]
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在链接到云南高校图书馆文献保障联盟下载...
云南高校图书馆联盟文献共享服务平台 版权所有©
您的IP:216.73.216.104