Time-varying formation control for heterogeneous clusters with switching topologies via reinforcement learning

Authors: YANG Jiaxiu (杨加秀); LI Xinkai (李新凯); ZHANG Hongli (张宏立) [1]; WANG Hao (王昊)

Affiliation: [1] School of Electrical Engineering, Xinjiang University, Urumqi 830017, China

Source: Acta Aeronautica et Astronautica Sinica (《航空学报》), 2024, No. 10, pp. 243-259 (17 pages)

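Funding: National Natural Science Foundation of China (62263030); Natural Science Foundation of Xinjiang Uygur Autonomous Region, Youth Science Fund (2022D01C86).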

Abstract: To address time-varying formation control of high-order heterogeneous unmanned cluster systems with uncertain model dynamics under switching communication topologies, an optimal distributed hierarchical formation control method based on integral reinforcement learning is proposed. Time-varying formation switching vectors are used to construct an augmented system from the multi-quadrotor unmanned aircraft system (UAS) and the multi-unmanned-ground-vehicle system, which transforms the time-varying formation control problem of the heterogeneous cluster into a stabilization problem. A value function with a discount factor is then introduced to convert this stabilization problem into an optimal control problem. Without altering the structure of the consensus-based distributed formation control protocol, only the feedback gain parameters are replaced and averaged, yielding the optimal time-varying formation switching control protocol for the whole heterogeneous cluster. Using a single-network "actor network-critic network" structure that combines the integral reinforcement learning algorithm with the distributed control method, the feedback gains of the distributed time-varying formation switching controller are updated online in real time. Theoretical proof and simulation experiments verify the effectiveness and superiority of the proposed control scheme.
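The abstract does not give the paper's equations. As a rough orientation only, the block below sketches the textbook form that a discounted value function and an integral reinforcement learning (IRL) Bellman equation usually take for a linear system with quadratic cost. Every symbol here is an assumption introduced for illustration and is not taken from the paper: the augmented state X, control input u, weighting matrices Q and R, discount factor \gamma, reinforcement interval T, kernel matrix P, feedback gain K, and system matrices A and B.

```latex
% Illustrative sketch only: standard discounted LQ / IRL forms,
% not the formulation used in the paper. All symbols are assumed.

% Discounted value function for the (assumed) augmented system
V\bigl(X(t)\bigr) = \int_{t}^{\infty} e^{-\gamma(\tau - t)}
  \left[ X^{\top}(\tau)\, Q\, X(\tau) + u^{\top}(\tau)\, R\, u(\tau) \right] d\tau

% IRL Bellman equation over a reinforcement interval T
V\bigl(X(t)\bigr) = \int_{t}^{t+T} e^{-\gamma(\tau - t)}
  \left[ X^{\top} Q X + u^{\top} R u \right] d\tau
  + e^{-\gamma T}\, V\bigl(X(t+T)\bigr)

% Quadratic parameterization and the resulting feedback gain
V(X) = X^{\top} P X, \qquad u = -K X, \qquad K = R^{-1} B^{\top} P

% Discounted algebraic Riccati equation satisfied by the optimal P
A^{\top} P + P A - \gamma P + Q - P B R^{-1} B^{\top} P = 0
```

In this generic setting, the IRL Bellman equation is what allows P, and hence the gain K, to be estimated from measured trajectory data over intervals of length T without knowing A. This is consistent with the abstract's emphasis on uncertain model dynamics and online gain updates, although the paper's exact construction may differ.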

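Keywords: integral reinforcement learning; heterogeneous cluster; time-varying formation control; distributed control; switching topology; optimal control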

Classification code: V249 [Aerospace Science and Technology: Flight Vehicle Design]
