Near Optimal Approximations and Finite Memory Policies for POMDPs with Continuous Spaces
Dedicated to Professor Peter E. Caines, on the occasion of his 80th birthday


Authors: KARA Ali Devran [1], BAYRAKTAR Erhan [2], YUKSEL Serdar [3]

Affiliations: [1] Department of Mathematics, Florida State University, FL 32306-2400, USA; [2] Department of Mathematics, University of Michigan, MI 48109, USA; [3] Department of Mathematics and Statistics, Queen's University, ON K7L 3N6, Canada

Source: Journal of Systems Science & Complexity (系统科学与复杂性学报, English edition), 2025, No. 1, pp. 238-270 (33 pages)

Funding: Partially supported by the National Science Foundation under Grant No. DMS-2106556; by the Susan M. Smith chair; and partially supported by the Natural Sciences and Engineering Research Council (NSERC) of Canada.

Abstract: The authors study an approximation method for partially observed Markov decision processes (POMDPs) with continuous spaces. Belief-MDP reduction, the standard approach to studying POMDPs, lifts the state space to the space of probability measures and therefore requires rigorous approximation methods for practical applications. Generalizing recent work, the authors present rigorous approximation methods that discretize the observation space and construct a fully observed finite MDP model from a finite-length history of the discrete observations and control actions. The authors show that the resulting policy is near-optimal under some regularity assumptions on the channel and under certain controlled filter stability requirements for the hidden state process. The authors also provide a Q-learning algorithm that uses a finite memory of discretized information variables, and prove its convergence to the optimality equation of the finite fully observed MDP constructed using the approximation method.
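Illustrative sketch (not taken from the paper): the abstract describes quantizing observations, keeping a finite window of recent quantized observations and actions, and running Q-learning on that window as if it were the state of a finite MDP. The Python below shows what such a loop might look like on a toy scalar system. The dynamics, noise levels, cost, uniform quantizer, uniformly random exploration, step-size rule, and every name (`quantize`, `step`, `train`, `greedy_action`, `N_BINS`, `WINDOW`) are assumptions made for illustration only; the paper's actual construction and conditions may differ.

```python
import random
from collections import defaultdict

ACTIONS = (-1.0, 0.0, 1.0)  # hypothetical finite action set
N_BINS = 5                  # number of observation quantization cells (assumed)
WINDOW = 2                  # finite memory length (assumed)
GAMMA = 0.9                 # discount factor (assumed)


def quantize(y, lo=-2.0, hi=2.0, n_bins=N_BINS):
    """Map a continuous observation to one of n_bins uniform cells on [lo, hi]."""
    y = min(max(y, lo), hi)                   # clip to the quantizer's range
    idx = int((y - lo) / (hi - lo) * n_bins)
    return min(idx, n_bins - 1)               # y == hi falls in the last cell


def step(x, u, rng):
    """Toy hidden-state dynamics, noisy observation, and stage cost (all assumed)."""
    cost = x * x + 0.1 * u * u
    x_next = 0.8 * x + 0.5 * u + rng.gauss(0.0, 0.1)
    y_next = x_next + rng.gauss(0.0, 0.2)     # the observation channel
    return x_next, y_next, cost


def train(n_steps=20000, seed=0):
    """Tabular Q-learning where the 'state' is a finite window of
    quantized observations and action indices."""
    rng = random.Random(seed)
    q = defaultdict(float)       # Q-values keyed by (window, action index)
    visits = defaultdict(int)    # per-entry visit counts for the step size
    x = rng.gauss(0.0, 1.0)
    window = ((quantize(x),) * WINDOW, (1,) * WINDOW)  # arbitrary initial window
    for _ in range(n_steps):
        a = rng.randrange(len(ACTIONS))       # uniformly random exploration
        x, y, cost = step(x, ACTIONS[a], rng)
        obs, acts = window
        next_window = (obs[1:] + (quantize(y),), acts[1:] + (a,))
        alpha = 1.0 / (1 + visits[(window, a)])
        visits[(window, a)] += 1
        # Cost-minimizing Q-learning update on the window-valued state.
        best_next = min(q[(next_window, i)] for i in range(len(ACTIONS)))
        q[(window, a)] += alpha * (cost + GAMMA * best_next - q[(window, a)])
        window = next_window
    return q


def greedy_action(q, window):
    """Index of the action with the smallest learned Q-value for this window."""
    return min(range(len(ACTIONS)), key=lambda i: q.get((window, i), 0.0))
```

The learned table can have at most `N_BINS ** WINDOW * len(ACTIONS) ** WINDOW * len(ACTIONS)` entries, which is the sense in which the window-based model is finite; a longer window or finer quantizer enlarges the table.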

Keywords: Filter stability; POMDP; reinforcement learning; stochastic control

Classification number: O17 [Science: Mathematics]

 
