Optimization Algorithms for Markov Control Processes Using Neuro-dynamic Programming (cited by: 1)

Source: Journal of University of Science and Technology of China (JUSTC), 2001, No. 5, pp. 549-557 (9 pages)

Funding: Supported by the National Natural Science Foundation of China (69974037) and the National High Performance Computing Foundation (00208)

Abstract: Motivated by the need for online optimization of real-world engineering systems, this paper studies single-sample-path-based optimization algorithms for Markov control processes governed by randomized stationary policies. The approach builds on the theory of Markov performance potentials, and the policies are represented by approximation architectures such as neural networks. Unlike traditional computation-based approaches, the policy parameters are updated iteratively along a sample path obtained by observing the operation of a real system, yielding an optimal (or suboptimal) randomized stationary policy. This optimization method is a form of neuro-dynamic programming. The algorithms adapt to different real systems through a suitable choice of the algorithm parameters. Finally, the convergence of the algorithms with probability one on an infinitely long sample path is analyzed, and a numerical example for a three-state controlled Markov chain is given.
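To make the role of performance potentials concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a hypothetical three-state, two-action model whose transition probabilities and costs are known, and it uses a one-parameter-per-state logistic policy as a stand-in for a neural network. Under those assumptions it computes the potentials exactly from the Poisson equation and uses the standard potential-based derivative formula; the paper instead works from a single observed sample path. All names and numbers below are illustrative.

```python
import math

# Hypothetical three-state, two-action model (all numbers are illustrative).
# P[a][i][j]: probability of moving from state i to state j under action a.
P = [
    [[0.5, 0.3, 0.2], [0.2, 0.6, 0.2], [0.3, 0.3, 0.4]],
    [[0.1, 0.6, 0.3], [0.4, 0.2, 0.4], [0.6, 0.1, 0.3]],
]
# F[a][i]: one-step cost in state i under action a.
F = [[1.0, 2.0, 3.0], [2.0, 0.5, 1.5]]


def policy(theta, i):
    """Randomized stationary policy: probabilities of actions 0 and 1 in state i.

    A logistic function of theta[i] stands in for the approximation
    architecture (an assumption made for this sketch).
    """
    p1 = 1.0 / (1.0 + math.exp(-theta[i]))
    return [1.0 - p1, p1]


def induced_chain(theta):
    """Transition matrix and cost vector of the chain induced by the policy."""
    n = len(theta)
    P_t = [[0.0] * n for _ in range(n)]
    f_t = [0.0] * n
    for i in range(n):
        mu = policy(theta, i)
        for a in range(2):
            f_t[i] += mu[a] * F[a][i]
            for j in range(n):
                P_t[i][j] += mu[a] * P[a][i][j]
    return P_t, f_t


def stationary(P_t, iters=2000):
    """Stationary distribution by power iteration (chain assumed ergodic)."""
    n = len(P_t)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P_t[i][j] for i in range(n)) for j in range(n)]
    return pi


def potentials(P_t, f_t, eta, iters=2000):
    """Potentials g = sum_k P^k (f - eta*e), which satisfy (I - P) g = f - eta*e."""
    n = len(P_t)
    g = [0.0] * n
    for _ in range(iters):
        g = [f_t[i] - eta + sum(P_t[i][j] * g[j] for j in range(n))
             for i in range(n)]
    return g


def average_cost(theta):
    """Long-run average cost eta = pi . f under the policy."""
    P_t, f_t = induced_chain(theta)
    pi = stationary(P_t)
    return sum(p * f for p, f in zip(pi, f_t))


def gradient(theta):
    """d(eta)/d(theta_s) = pi_s * dp1 * (dF_s + sum_j dP_sj * g_j)."""
    P_t, f_t = induced_chain(theta)
    pi = stationary(P_t)
    eta = sum(p * f for p, f in zip(pi, f_t))
    g = potentials(P_t, f_t, eta)
    grad = []
    for s in range(len(theta)):
        p1 = policy(theta, s)[1]
        d_cost = F[1][s] - F[0][s]
        d_trans = sum((P[1][s][j] - P[0][s][j]) * g[j] for j in range(len(g)))
        grad.append(pi[s] * p1 * (1.0 - p1) * (d_cost + d_trans))
    return grad


if __name__ == "__main__":
    theta = [0.0, 0.5, -0.5]
    print("average cost:", average_cost(theta))
    print("gradient:", gradient(theta))
```

A gradient step such as `theta[s] -= step * gradient(theta)[s]` would then lower the average cost; in a sample-path setting, the exact quantities `pi` and `g` would have to be replaced by estimates formed along the observed trajectory.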

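Keywords: Markov performance potential theory; Markov control processes; randomized stationary policy; sample path; neuro-dynamic programming; stochastic decision problems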

Classification: O231.3 [Science: Operations Research and Cybernetics]; O221.3 [Science: Mathematics]
