Convergence Problem of a Sequence of First Passage Markov Decision Processes with Varying Discount Factors (original title: 可变折扣马氏决策过程首达模型列的收敛问题)

Authors: WU Xiao (吴晓); GUO Zhenbin (郭圳滨)

Affiliations: [1] School of Mathematics and Statistics, Zhaoqing University, Zhaoqing 526061, China; [2] Development Research Center, GF Securities Co., Ltd., Shanghai 200120, China

Source: Chinese Journal of Applied Probability and Statistics (《应用概率统计》), 2021, No. 6, pp. 598-610 (13 pages)

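Funding: Supported by the National Natural Science Foundation of China (Grant No. 11961005) and the Characteristic Innovation Project Fund for Regular Universities of Guangdong Province (Grant No. 2018KTSCX253).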

Abstract: In this paper, we study the convergence problem of a sequence of first passage Markov decision processes on a countable state space with multiple constraints and varying discount factors. Using "occupation measures" and their related properties, we transform the constrained optimization problems for the sequence of first passage models into equivalent constrained linear programming problems on the set of occupation measures (i.e., the convex analytic approach), and then prove, under suitable conditions, that the optimal values and optimal policies of the sequence of first passage models converge respectively to those of the "limit" model.

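Keywords: Markov decision processes; first passage model; multiple constraints; state-dependent discount factors; convex analytic approach; convergence problem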

Classification number: O211.62 [Science: Probability Theory and Mathematical Statistics]
