Authors: WU Liming (武黎明), WANG Mingming (王明明)[1,2], LUO Jianjun (罗建军), ZHANG Dayu (张大羽)[3], LIANG Chengxi (梁澄汐)
Affiliations: [1] National Key Laboratory of Aerospace Flight Dynamics, Northwestern Polytechnical University, Xi’an 710072, Shaanxi, China; [2] Development Institute of Northwestern Polytechnical University in Shenzhen, Shenzhen 518057, Guangdong, China; [3] China Academy of Space Technology (Xi’an), Xi’an 710100, Shaanxi, China
Source: Robot (《机器人》), 2025, No. 2, pp. 239-248, 258 (11 pages)
Funding: National Natural Science Foundation of China (U2013206, U24B2001); Guangdong Basic and Applied Basic Research Foundation (2024A1515011178).
Abstract: On-orbit assembly by space robots is complicated by vibration of the assembly components and by dynamic coupling between the robot and the structure, and existing methods suffer from difficult parameter tuning and poor control performance. To address these problems, a hybrid model-data-driven approach is proposed that integrates impedance control with deep reinforcement learning to enable efficient learning of assembly strategies. First, a modular on-orbit assembly scenario for a segmented space telescope is established, the dynamic coupling between the free-floating space robot and the assembly components is analyzed, and the module assembly task is formulated as a Markov decision process. Then, with joint impedance control of the space robot as a prior model, a deep reinforcement learning-based method for learning the assembly strategy is developed to handle the dynamic coupling effects between the space robot and the assembly components. Finally, the proximal policy optimization (PPO) algorithm is used to learn the assembly strategy. To allow rapid validation of the strategy learning method, a parallelized training and testing environment for on-orbit assembly by space robots is built in Isaac Gym. Simulations and analyses demonstrate that the proposed method improves compliant control performance and is robust to uncertainty.
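For readers unfamiliar with PPO, the following is a minimal sketch of the standard clipped surrogate objective that the algorithm maximizes. It is a generic illustration, not the authors' implementation: the function name `ppo_clip_objective`, its parameters, and the default `clip_eps=0.2` are assumptions chosen for this example, and the abstract does not state how the policy output is combined with the joint impedance controller.

```python
import math


def ppo_clip_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (to be maximized).

    logp_new / logp_old: log-probabilities of the sampled actions under the
    current policy and the data-collecting policy (hypothetical inputs).
    advantages: advantage estimates for the same samples.
    clip_eps: clipping range; 0.2 is a common default, assumed here.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        # Probability ratio between the new and old policies.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

The clipping removes the incentive to move the policy far from the one that collected the data in a single update, which is one reason PPO is a common choice for large parallel simulation setups.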