Authors: 唐昊 (TANG Hao)[1], 杨羊 (YANG Yang), 戴飞 (DAI Fei), 谭琦 (TAN Qi)[1]
Affiliation: [1] School of Electrical Engineering and Automation, Hefei University of Technology, Hefei 230009, China
Source: 《控制与决策》 (Control and Decision), 2019, No. 7, pp. 1456-1462 (7 pages)
Funding: National Natural Science Foundation of China (61573126, 71231004); Fundamental Research Funds for the Central Universities (JZ2016YYPY0052); Specialized Research Fund for the Doctoral Program of Higher Education (20130111110007)
Abstract: This paper studies the optimal control of the look-ahead distance in a conveyor-serviced production station (CSPS) system with multiple part types arriving, with the aim of improving the system's working efficiency. As the number of part types increases, the size of the system state space grows exponentially. Traditional Q-learning suffers from the curse of dimensionality on large discrete state spaces and cannot directly handle the look-ahead distance as a continuous variable, so an RBF network is introduced to approximate the Q-value function: the network input is a state-action pair, and the output is the Q-value of that pair. An RBF-Q learning algorithm is presented and applied to the optimal control of the multi-type CSPS system, achieving Q-learning over a continuous action space. Simulations for different numbers of part types show that the RBF-Q learning algorithm effectively optimizes the performance of the multi-type CSPS system and improves learning speed.
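Keywords: RBF network; Q-learning; multiple part types; conveyor-serviced production station; look-ahead distance
Classification: TP278 [Automation and Computer Technology: Detection Technology and Automatic Devices]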
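
The abstract describes an RBF network that takes a state-action pair as input and outputs its Q-value. The sketch below illustrates that general idea only; it is not the authors' algorithm. It assumes Gaussian basis functions with fixed, hand-picked centers and a shared width, a plain discounted one-step Q-learning update on the output weights, and greedy selection over a finite grid of candidate look-ahead distances. The class name `RBFQ` and all parameter names are hypothetical.

```python
import math


class RBFQ:
    """Q(s, a) approximated as a weighted sum of Gaussian RBFs (illustrative)."""

    def __init__(self, centers, width, lr=0.1, gamma=0.9):
        # Each center is a tuple: state components followed by the action.
        self.centers = centers
        self.width = width
        self.lr = lr          # learning rate (assumed constant)
        self.gamma = gamma    # discount factor (assumed criterion)
        self.weights = [0.0] * len(centers)

    def features(self, state, action):
        # Concatenate the state vector and the scalar action into one input.
        x = tuple(state) + (action,)
        scale = 2.0 * self.width ** 2
        return [
            math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, c)) / scale)
            for c in self.centers
        ]

    def q(self, state, action):
        phi = self.features(state, action)
        return sum(w * f for w, f in zip(self.weights, phi))

    def greedy_action(self, state, candidates):
        # Continuous action handled by searching a grid of candidates.
        return max(candidates, key=lambda a: self.q(state, a))

    def update(self, s, a, r, s_next, candidates):
        # One-step Q-learning target, then a gradient step on the weights.
        target = r + self.gamma * max(self.q(s_next, a2) for a2 in candidates)
        td_error = target - self.q(s, a)
        phi = self.features(s, a)
        for i, f in enumerate(phi):
            self.weights[i] += self.lr * td_error * f
        return td_error


if __name__ == "__main__":
    grid = [0.0, 0.5, 1.0]  # hypothetical candidate look-ahead distances
    agent = RBFQ(centers=[(0.0, 0.0), (1.0, 1.0)], width=1.0)
    agent.update((0.0,), 0.5, 1.0, (1.0,), grid)
    print(agent.greedy_action((0.0,), grid))
```

Because the Q-value is a smooth function of the action input, the same weights can score any look-ahead distance, not just those seen during training. The grid search in `greedy_action` is one simple way to choose among them.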