Compliant Control Method for On-orbit Assembly by Space Robots Based on Reinforcement Learning


Authors: WU Liming, WANG Mingming[1,2], LUO Jianjun, ZHANG Dayu[3], LIANG Chengxi

Affiliations: [1] National Key Laboratory of Aerospace Flight Dynamics, Northwestern Polytechnical University, Xi’an 710072, Shaanxi, China; [2] Development Institute of Northwestern Polytechnical University in Shenzhen, Shenzhen 518057, Guangdong, China; [3] China Academy of Space Technology (Xi’an), Xi’an 710100, Shaanxi, China

Source: Robot (《机器人》), 2025, No. 2, pp. 239-248, 258 (11 pages in total)

Funding: National Natural Science Foundation of China (U2013206, U24B2001); Guangdong Basic and Applied Basic Research Foundation (2024A1515011178).

Abstract: To address the vibration of assembly components and the dynamic coupling between robot and structure during on-orbit assembly by space robots, and to overcome the difficult parameter tuning and poor control performance of existing methods, a hybrid model- and data-driven approach is proposed that combines impedance control with deep reinforcement learning to enable efficient learning of assembly strategies. First, a modular on-orbit assembly scenario for a segmented space telescope is established, the dynamic coupling between the free-floating space robot and the assembly components is analyzed, and the module assembly problem is formulated as a Markov decision process. Then, with joint impedance control of the space robot as a prior model, a deep reinforcement learning-based method for learning the assembly strategy is constructed to cope with the dynamic coupling between the space robot and the assembly components. Finally, the proximal policy optimization (PPO) algorithm is used to learn the assembly strategy. To enable rapid validation of the strategy learning method, a parallelized training and testing environment for on-orbit assembly by space robots is built with Isaac Gym. Simulations and analysis verify that the proposed method improves compliant control performance and is robust to uncertainty.
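The abstract names two building blocks, a joint impedance controller used as a prior model and PPO used to learn the assembly strategy, without giving their equations. The following is only a minimal sketch of how such pieces are commonly written, not the authors' implementation. The function names, the action-to-gain mapping in `apply_policy_action`, and the `scale` and `clip_eps` values are all hypothetical assumptions made for illustration; the paper's actual action space, gains, and reward are not stated in this record.

```python
import numpy as np


def joint_impedance_torque(q, dq, q_des, dq_des, stiffness, damping):
    """Standard joint-space impedance law: tau = K (q_des - q) + D (dq_des - dq)."""
    return stiffness * (q_des - q) + damping * (dq_des - dq)


def apply_policy_action(action, base_stiffness, base_damping, scale=0.5):
    """ASSUMED mapping (not from the paper): an action in [-1, 1] rescales
    the nominal stiffness and damping gains of the impedance prior."""
    a = np.clip(action, -1.0, 1.0)
    n = base_stiffness.shape[0]
    stiffness = base_stiffness * (1.0 + scale * a[:n])
    damping = base_damping * (1.0 + scale * a[n:])
    return stiffness, damping


def ppo_clipped_objective(ratio, advantage, clip_eps=0.2):
    """PPO clipped surrogate: mean(min(r * A, clip(r, 1 - eps, 1 + eps) * A))."""
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps)
    return float(np.mean(np.minimum(ratio * advantage, clipped * advantage)))


if __name__ == "__main__":
    # Two-joint toy example with made-up nominal gains.
    k0 = np.array([100.0, 50.0])
    d0 = np.array([10.0, 5.0])
    k, d = apply_policy_action(np.zeros(4), k0, d0)  # zero action keeps the prior
    tau = joint_impedance_torque(
        q=np.zeros(2), dq=np.zeros(2),
        q_des=np.array([0.1, -0.2]), dq_des=np.zeros(2),
        stiffness=k, damping=d,
    )
    print(tau)
```

In this sketch a zero action leaves the impedance prior unchanged, so the learned policy only has to correct the model-based controller rather than learn the whole torque command; this is one common way to combine a model prior with a data-driven policy, offered here as an assumption.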

Keywords: on-orbit assembly; space robot; compliant control; deep reinforcement learning; parallel training

Classification code: G63 [Cultural Sciences: Education]

 
