报酬无界的连续时间折扣马氏决策规划 (cited by: 2)

Continuous Time Markov Decision Processes with Unbounded Rewards under the Discounted Criterion

Affiliations: [1] Yunnan University, Kunming 650091; [2] Kunming Institute of Technology, Kunming 650093

Source: Chinese Journal of Applied Probability and Statistics (《应用概率统计》), 1997, No. 1, pp. 1-10 (10 pages)

Funding: Yunnan Provincial Applied Basic Research Fund

Abstract: This paper investigates continuous time Markov decision processes (CTMDPs) under the discounted criterion, where the state space and the action set are countable, the reward functions are unbounded, and the transition rates are uniformly bounded. A new class of unbounded reward functions is introduced. Within a new class of Markov policies, all results that hold under bounded rewards are proved to hold equally under unbounded rewards. Through a study of the structure of optimal policies, a necessary and sufficient condition for a policy to be optimal in this model is obtained.
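
For orientation, the model described in the abstract can be written in the notation commonly used for discounted CTMDPs. This is a sketch of the standard formulation and is not taken from the paper: the symbols $S$, $A(i)$, $q(j \mid i,a)$, $r(i,a)$, $\alpha$, $V_\alpha$ and $u$ are assumptions, as is the conservativeness of the rates, and the paper's own notation and its specific condition on the unbounded rewards may differ.

```latex
% Standard discounted-CTMDP notation (assumed here, not taken from the paper).
% S: countable state space; A(i): countable action set available in state i.
% Transition rates: conservative (assumed) and uniformly bounded (as in the abstract).
q(j \mid i,a) \ge 0 \ (j \ne i), \qquad
\sum_{j \in S} q(j \mid i,a) = 0, \qquad
\sup_{i \in S,\, a \in A(i)} |q(i \mid i,a)| < \infty .

% Expected discounted reward of a policy \pi started in state i, discount rate \alpha > 0.
V_\alpha(\pi, i) = \mathbb{E}_i^{\pi}\!\left[ \int_0^{\infty} e^{-\alpha t}\, r(x_t, a_t)\, dt \right],
\qquad
V_\alpha^{*}(i) = \sup_{\pi} V_\alpha(\pi, i).

% Optimality equation in its usual form.
\alpha\, u(i) = \sup_{a \in A(i)} \Big\{ r(i,a) + \sum_{j \in S} q(j \mid i,a)\, u(j) \Big\},
\qquad i \in S .
```

With bounded rewards, $V_\alpha^{*}$ is typically characterized as the unique bounded solution of this equation. According to the abstract, the corresponding results carry over to unbounded rewards under the paper's new condition.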

Keywords: Markov decision programming; unbounded rewards; discounted criterion; CTMDP

Classification: O211.62 [Science: Probability Theory and Mathematical Statistics]
