Authors: TAO Xinyu, WANG Yan, JI Zhicheng [1,2]
Affiliations: [1] Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University, Wuxi 214122, Jiangsu, China; [2] School of the Internet of Things Engineering, Jiangnan University, Wuxi 214122, Jiangsu, China
Source: CAAI Transactions on Intelligent Systems (《智能系统学报》), 2023, No. 1, pp. 23-35 (13 pages)
Funding: National Key R&D Program of China (2018YFB1701903).
Abstract: Traditional process route formulation rules assume a fixed processing environment, so they cannot respond quickly to dynamic changes in that environment when formulating energy-saving process routes. This paper therefore proposes an energy-saving process route discovery method based on the deep Q network (DQN). Using a Markov decision process, the state vector, action space, and reward function are defined and an energy-saving process route model is established. The route planning problem under a dynamically changing processing environment is thereby transformed into a DQN agent decision-making problem, which is solved by exploiting the reusability and extensibility of decision-making experience. To improve the convergence speed and solution quality of DQN, an exploration mechanism based on the S function and a weighted experience pool are proposed, and a double Q network is used. Simulation results show that, compared with the algorithm before improvement, the improved algorithm finds energy-saving process routes faster and with better quality in a dynamic processing environment. Compared with the genetic algorithm, the simulated annealing algorithm, and the particle swarm algorithm, the improved algorithm not only discovers energy-saving process routes fastest but also obtains solutions of equal or higher precision.
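Keywords: deep reinforcement learning; deep Q network; dynamic processing environment; process route; Markov decision process; agent decision-making; double Q network; heuristic algorithm
Classification number: TP273 [Automation and Computer Technology: Detection Technology and Automatic Devices]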
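The abstract names three DQN modifications (an S-function exploration mechanism, a weighted experience pool, and a double Q network) without giving their formulas. The following is a minimal Python sketch of what such components commonly look like, not the authors' implementation. All function names, the logistic form of the schedule, and every default parameter (`eps_min`, `eps_max`, `steepness`, `gamma`) are assumptions made for illustration. Only the double-Q target follows the standard, well-known definition.

```python
import math
import random


def sigmoid_epsilon(step, total_steps, eps_min=0.05, eps_max=1.0, steepness=10.0):
    """Hypothetical S-shaped exploration schedule (assumed form, not from the paper).

    Epsilon decays from about eps_max to about eps_min along a logistic curve,
    so exploration stays high early and drops quickly around mid-training.
    """
    x = steepness * (step / total_steps - 0.5)
    return eps_min + (eps_max - eps_min) / (1.0 + math.exp(x))


def double_q_target(reward, next_q_online, next_q_target, gamma=0.9, done=False):
    """Standard double-Q target.

    The online network selects the next action; the target network evaluates it.
    """
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


def weighted_sample(pool, weights, k, rng=random):
    """Hypothetical weighted experience pool: draw k transitions with
    probability proportional to their weights (how the paper assigns
    weights is not stated in the abstract)."""
    return rng.choices(pool, weights=weights, k=k)
```

For example, with `next_q_online = [0.2, 0.8]` and `next_q_target = [0.5, 0.3]`, the online network selects action 1, and the target network's value for that action (0.3) is used in the target rather than the target network's own maximum (0.5). This decoupling is what reduces the overestimation bias of plain DQN.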