Decoupling control method based on multi-agent deep reinforcement learning (Cited by: 1)

Authors: XIAO Zhongyu, XIA Zhongsheng, HONG Wenjing, SHI Jia[1,2]

Affiliations: [1] College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, Fujian, China; [2] Gulei Innovation Institute, Xiamen University, Zhangzhou 363123, Fujian, China

Source: Journal of Xiamen University (Natural Science), 2024, Issue 3, pp. 570-582 (13 pages)

Abstract: [Objective] A defining characteristic of modern industrial processes is the coupling of nonlinear dynamics with multiple inputs and multiple outputs (MIMO). Decoupling control of such complex nonlinear MIMO systems is of vital importance for the operation and optimization of the production process: to ensure both the optimality and the simplicity of process control, all key process variables must be effectively decoupled in control. However, owing to the difficulties of process modeling and system analysis, achieving such decoupling in MIMO systems remains challenging. [Methods] Intelligent optimization based on multi-agent reinforcement learning (MARL) can learn and optimize control policies without relying on any process knowledge, and the control objectives can be set so that each agent is optimized independently yet collaboratively. These features make MARL well suited to the decoupling control problem of complex nonlinear MIMO processes. This paper proposes a decoupling control system design method based on the multi-agent deep deterministic policy gradient (MADDPG) algorithm. In this method, the classical loop-gain maximization principle is used to pair process inputs and outputs into the corresponding decoupled control loops. Each control loop is controlled and optimized by an agent with an independent control objective and its own state feedback information, and a reward function is designed for each agent based on the objective of its control loop. Finally, the MADDPG algorithm is used to train the agents, yielding the optimal decoupling control strategy. [Results] To demonstrate the effectiveness of the proposed design scheme, the reaction process in a continuous stirred tank reactor (CSTR) is modeled on the Aspen software platform, taking the hydrogenation of aniline to cyclohexylamine as an example. To guarantee product quality in this process, the residence time of the reactants in the reactor and the reaction temperature must be regulated according to the optimal production conditions; in other words, two decoupled control loops need to be designed, one controlling the reaction temperature and the other controlling the molar flow rate of the product. The simulation results show that the proposed scheme can simultaneously track the setpoints of both controlled variables, and that under the same control objectives it achieves better stability and a smaller steady-state control error than both a single-agent scheme and a PID (proportional-integral-derivative) control scheme. [Conclusions] For the decoupling control problem of complex nonlinear MIMO systems, the multi-agent reinforcement learning algorithm achieves decoupling control without relying on a process model while maintaining good control performance.
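To make the control structure described in the abstract more concrete, the following is a minimal, illustrative PyTorch sketch of a MADDPG arrangement for two decoupled loops: one decentralized actor per loop (reaction temperature, product molar flow) and one centralized critic per agent that scores the joint observations and actions. All names (Actor, CentralCritic, loop_reward), network sizes, observation contents, and reward weights are assumptions for illustration only and are not taken from the paper; the paper's agents are trained against an Aspen model of the CSTR, which is not reproduced here.

# Illustrative sketch only; dimensions, rewards and names are hypothetical.
import torch
import torch.nn as nn

N_AGENTS = 2   # one agent per decoupled loop (temperature, product molar flow)
OBS_DIM = 3    # hypothetical per-loop observation (setpoint, measurement, error)
ACT_DIM = 1    # one manipulated variable per loop

class Actor(nn.Module):
    """Decentralized policy: each agent sees only its own loop's state."""
    def __init__(self, obs_dim=OBS_DIM, act_dim=ACT_DIM):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, act_dim), nn.Tanh(),  # bounded control action
        )
    def forward(self, obs):
        return self.net(obs)

class CentralCritic(nn.Module):
    """Centralized critic: scores joint observations and joint actions (MADDPG)."""
    def __init__(self, n_agents=N_AGENTS, obs_dim=OBS_DIM, act_dim=ACT_DIM):
        super().__init__()
        joint_dim = n_agents * (obs_dim + act_dim)
        self.net = nn.Sequential(
            nn.Linear(joint_dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, 1),
        )
    def forward(self, joint_obs, joint_act):
        return self.net(torch.cat([joint_obs, joint_act], dim=-1))

def loop_reward(setpoint, measurement, prev_err=0.0, w_err=1.0, w_rate=0.1):
    """Hypothetical per-loop reward: penalize tracking error and its change,
    mirroring the idea of one independent objective per control loop."""
    err = setpoint - measurement
    return -(w_err * err ** 2 + w_rate * (err - prev_err) ** 2)

# One actor and one centralized critic per agent.
actors = [Actor() for _ in range(N_AGENTS)]
critics = [CentralCritic() for _ in range(N_AGENTS)]

# Toy forward pass with random "process states" to show the data flow.
obs = [torch.randn(1, OBS_DIM) for _ in range(N_AGENTS)]
acts = [actor(o) for actor, o in zip(actors, obs)]
joint_obs = torch.cat(obs, dim=-1)
joint_act = torch.cat(acts, dim=-1)
q_values = [critic(joint_obs, joint_act) for critic in critics]
print([q.item() for q in q_values])

The split reflects the standard MADDPG idea of centralized training with decentralized execution: during training each critic sees the other loop's action, which is where the coupling enters, while at execution time each actor acts only on its own loop's feedback, giving the decoupled loop-by-loop behavior the abstract describes.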

Keywords: multi-agent reinforcement learning; decoupling control; deep deterministic policy gradient; continuous stirred tank reactor; nonlinear multi-input multi-output system

CLC number: TP29 [Automation and Computer Technology / Detection Technology and Automation Devices]

 
