Authors: Su Yingying (苏莹莹)[1], Wang Wanshan (王宛山)[1], Wang Jianrong (王建荣)[1], Tang Liang (唐亮)[1]
Affiliation: [1] School of Mechanical Engineering and Automation, Northeastern University, Shenyang 110004, Liaoning, China
Source: Journal of Northeastern University (Natural Science), 2009, No. 2, pp. 279-282 (4 pages)
Funding: Specialized Research Fund for the Doctoral Program of Higher Education, Ministry of Education (20060145017)
Abstract: In task allocation problems, a "curse of dimensionality" arises when the state-action space of the Markov decision process model becomes very large. To address this, a reinforcement learning strategy based on a BP neural network is proposed. The good generalization ability of the BP neural network is exploited to store and approximate the Q values of state-action pairs during reinforcement learning, and an optimal action-selection strategy based on Q-learning together with a BP neural network model and algorithm for Q-learning are designed. The proposed method was applied to task allocation in process planning and simulated in Matlab; the results confirm that it performs well and approximates actions effectively, further increasing the applicability of reinforcement learning to task allocation problems. (An illustrative code sketch of this scheme is given after the record below.)
Classification: TH164 (Mechanical Engineering — Machine Manufacturing and Automation)
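
The abstract describes approximating and storing Q values with a BP (backpropagation) neural network instead of a lookup table, combined with a Q-learning action-selection rule. The following is a minimal sketch of that general idea, not the authors' code: the toy task-allocation environment, the reward, the network sizes, and all hyperparameters are illustrative assumptions, and the network here outputs one Q value per discrete action (a common equivalent of storing Q for each state-action pair).

```python
# Minimal sketch (assumptions throughout): Q-learning with a one-hidden-layer
# BP neural network approximating Q values, applied to a toy task-allocation task.
import numpy as np

rng = np.random.default_rng(0)

class BPQNetwork:
    """One-hidden-layer MLP trained by backpropagation; maps a state vector
    to one Q value per action, replacing a tabular Q store."""
    def __init__(self, n_state, n_action, n_hidden=32, lr=0.01):
        self.W1 = rng.normal(0.0, 0.1, (n_state, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.1, (n_hidden, n_action))
        self.b2 = np.zeros(n_action)
        self.lr = lr

    def forward(self, s):
        self.h = np.tanh(s @ self.W1 + self.b1)   # hidden activations
        return self.h @ self.W2 + self.b2         # Q values for all actions

    def update(self, s, a, target):
        """One backpropagation step pushing Q(s, a) toward the TD target."""
        q = self.forward(s)
        err = np.zeros_like(q)
        err[a] = q[a] - target                    # error only on the chosen action
        dh = (err @ self.W2.T) * (1.0 - self.h ** 2)   # tanh derivative
        self.W2 -= self.lr * np.outer(self.h, err)
        self.b2 -= self.lr * err
        self.W1 -= self.lr * np.outer(s, dh)
        self.b1 -= self.lr * dh

def epsilon_greedy(qnet, s, epsilon):
    """Q-learning action selection: explore with probability epsilon, else arg-max Q."""
    if rng.random() < epsilon:
        return int(rng.integers(len(qnet.b2)))
    return int(np.argmax(qnet.forward(s)))

# Hypothetical toy environment: allocate each of N_TASKS tasks to one of
# N_MACHINES machines; the state is the one-hot index of the current task,
# and the made-up reward penalizes piling tasks onto one machine.
N_TASKS, N_MACHINES = 6, 3
GAMMA, EPSILON = 0.9, 0.1
qnet = BPQNetwork(n_state=N_TASKS, n_action=N_MACHINES)

for episode in range(500):
    load = np.zeros(N_MACHINES)
    for t in range(N_TASKS):
        s = np.eye(N_TASKS)[t]
        a = epsilon_greedy(qnet, s, EPSILON)
        load[a] += 1.0
        reward = -load[a]
        if t + 1 < N_TASKS:                       # bootstrap from the next state's max Q
            s_next = np.eye(N_TASKS)[t + 1]
            target = reward + GAMMA * np.max(qnet.forward(s_next))
        else:
            target = reward                       # last task: no bootstrap
        qnet.update(s, a, target)

print("greedy allocation:",
      [int(np.argmax(qnet.forward(np.eye(N_TASKS)[t]))) for t in range(N_TASKS)])
```

As in the method summarized above, the network serves purely as a generalizing store for Q values, so the number of parameters stays fixed even when the state-action space grows, which is the point of using function approximation against the curse of dimensionality.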