A Discounted Vector-Valued Markovian Decision Model with Unbounded Rewards: The Optimal Stationary Policies and Algorithm

Authors: Zhang Sheng (张升) [1], Zhang Jihong (张继红) [1]

Affiliations: [1] Yunnan University; Kunming Institute of Technology

Source: Journal of Yunnan University (Natural Sciences Edition), 1994, No. 4, pp. 299-305 (7 pages)

Abstract: This paper deals with stationary policies in the vector-valued model with unbounded rewards discussed in [5]. A method for improving a stationary policy is derived, and the optimality equation for the vector model is established. A necessary and sufficient condition for a stationary policy to be strongly optimal is obtained. It is shown that the expected return function of an optimal stationary policy must be a maximal fixed point, as defined in the paper. Finally, a policy iteration algorithm for finding optimal stationary policies is proposed, and a numerical example is given.

Keywords: discounted Markov decision programming; optimal stationary policy; unbounded rewards; vector-valued

Classification: O22 [Science: Operations Research and Cybernetics]
