Decoupling control method based on multi-agent deep reinforcement learning (Cited by: 1)

Authors: XIAO Zhongyu, XIA Zhongsheng, HONG Wenjing, SHI Jia[1,2]

Affiliations: [1] College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, Fujian, China; [2] Gulei Innovation Institute, Xiamen University, Zhangzhou 363123, Fujian, China

Source: Journal of Xiamen University (Natural Science), 2024, Issue 3, pp. 570-582 (13 pages)

Abstract: [Objective] A defining characteristic of modern industrial processes is the coupling of nonlinear dynamics with multiple inputs and multiple outputs (MIMO). Decoupling control of such complex nonlinear MIMO systems is of vital importance for the operation and optimization of the production process: to ensure both the optimality and the simplicity of process control, all key process variables must be effectively decoupled in control. However, owing to the difficulties of process modeling and system analysis, achieving such decoupling in MIMO systems remains challenging. [Methods] Intelligent optimization based on multi-agent reinforcement learning (MARL) can learn and optimize control policies without relying on any process knowledge, and the control objectives can be set so that each agent is optimized independently yet collaboratively. These features make MARL well suited to the decoupling control problem of complex nonlinear MIMO processes. This paper proposes a decoupling control system design method based on the multi-agent deep deterministic policy gradient (MADDPG) algorithm. In this method, the classical loop-gain maximization principle is used to pair process inputs and outputs into the corresponding decoupled control loops. Each control loop is controlled and optimized by an agent with an independent control objective and its own state feedback information, and a reward function is designed for each agent based on the objective of its control loop. Finally, the MADDPG algorithm is used to train the agents, yielding the optimal decoupling control strategy. [Results] To demonstrate the effectiveness of the proposed design scheme, the reaction process in a continuous stirred tank reactor (CSTR) is modeled on the Aspen software platform, taking the hydrogenation of aniline to cyclohexylamine as an example. To guarantee product quality in this process, the residence time of the reactants in the reactor and the reaction temperature must be regulated according to the optimal production conditions; in other words, two decoupled control loops need to be designed, one controlling the reaction temperature and the other controlling the molar flow rate of the product. The simulation results show that the proposed scheme can simultaneously track the setpoints of both controlled variables, and that under the same control objectives it achieves better stability and a smaller steady-state control error than both a single-agent scheme and a PID (proportional-integral-derivative) control scheme. [Conclusions] For the decoupling control problem of complex nonlinear MIMO systems, the multi-agent reinforcement learning algorithm achieves decoupling control without relying on a process model while maintaining good control performance.
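To make the control structure described in the abstract more concrete, the following is a minimal, illustrative PyTorch sketch of a MADDPG arrangement for two decoupled loops: one decentralized actor per loop (reaction temperature, product molar flow) and one centralized critic per agent that scores the joint observations and actions. All names (Actor, CentralCritic, loop_reward), network sizes, observation contents, and reward weights are assumptions for illustration only and are not taken from the paper; the paper's agents are trained against an Aspen model of the CSTR, which is not reproduced here.

# Illustrative sketch only; dimensions, rewards and names are hypothetical.
import torch
import torch.nn as nn

N_AGENTS = 2   # one agent per decoupled loop (temperature, product molar flow)
OBS_DIM = 3    # hypothetical per-loop observation (setpoint, measurement, error)
ACT_DIM = 1    # one manipulated variable per loop

class Actor(nn.Module):
    """Decentralized policy: each agent sees only its own loop's state."""
    def __init__(self, obs_dim=OBS_DIM, act_dim=ACT_DIM):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, act_dim), nn.Tanh(),  # bounded control action
        )
    def forward(self, obs):
        return self.net(obs)

class CentralCritic(nn.Module):
    """Centralized critic: scores joint observations and joint actions (MADDPG)."""
    def __init__(self, n_agents=N_AGENTS, obs_dim=OBS_DIM, act_dim=ACT_DIM):
        super().__init__()
        joint_dim = n_agents * (obs_dim + act_dim)
        self.net = nn.Sequential(
            nn.Linear(joint_dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, 1),
        )
    def forward(self, joint_obs, joint_act):
        return self.net(torch.cat([joint_obs, joint_act], dim=-1))

def loop_reward(setpoint, measurement, prev_err=0.0, w_err=1.0, w_rate=0.1):
    """Hypothetical per-loop reward: penalize tracking error and its change,
    mirroring the idea of one independent objective per control loop."""
    err = setpoint - measurement
    return -(w_err * err ** 2 + w_rate * (err - prev_err) ** 2)

# One actor and one centralized critic per agent.
actors = [Actor() for _ in range(N_AGENTS)]
critics = [CentralCritic() for _ in range(N_AGENTS)]

# Toy forward pass with random "process states" to show the data flow.
obs = [torch.randn(1, OBS_DIM) for _ in range(N_AGENTS)]
acts = [actor(o) for actor, o in zip(actors, obs)]
joint_obs = torch.cat(obs, dim=-1)
joint_act = torch.cat(acts, dim=-1)
q_values = [critic(joint_obs, joint_act) for critic in critics]
print([q.item() for q in q_values])

The split reflects the standard MADDPG idea of centralized training with decentralized execution: during training each critic sees the other loop's action, which is where the coupling enters, while at execution time each actor acts only on its own loop's feedback, giving the decoupled loop-by-loop behavior the abstract describes.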

Keywords: multi-agent reinforcement learning; decoupling control; deep deterministic policy gradient; continuous stirred tank reactor; nonlinear multi-input multi-output system

CLC number: TP29 [Automation and Computer Technology / Detection Technology and Automation Devices]

 
