Authors: ZHANG Binghan (张柄汉)[1], WANG Chen (王琛)[1], PENG Zhaotao (彭兆涛)[1], ZHANG Yizhai (张夷斋)[2], ZHANG Fan (张帆)[2]
Affiliations: [1] School of Engineering and Machinery, Chang'an University, Xi'an 710054, China; [2] School of Astronautics, Northwestern Polytechnic University, Xi'an 710072, China
Source: Journal of Astronautics (《宇航学报》), 2023, No. 12, pp. 1934-1943 (10 pages)
Funding: National Natural Science Foundation of China (62173275, 62222313).
Abstract: To address target adaptability and the complexity of capture-action planning in space non-cooperative target removal tasks, an envelope capture strategy is proposed that combines reinforcement learning with a "multi-arm grouping cooperation" mechanism. First, the physical and kinematic models of the multi-arm capture mechanism are constructed. A reinforcement learning controller is then designed using the soft actor-critic (SAC) algorithm with a pretraining (PT) method, and a reward function based on the "multi-arm grouping cooperation" mechanism is designed to train the optimal capture action. To verify the strategy's high efficiency on single-target operations and high adaptability across multi-target operations, simulation experiments are carried out on a variety of targets. The results show that the obtained capture strategy captures targets of various configurations efficiently and adaptively.
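The abstract's "multi-arm grouping cooperation" reward mechanism could, in spirit, combine a per-arm approach term with a within-group coordination term. The sketch below is purely illustrative and not the paper's actual reward function: the function name, the two terms, and the weights `w_close` and `w_sync` are all assumptions for the sake of a concrete example.

```python
import numpy as np

def grouped_cooperation_reward(arm_tip_dists, group_ids, w_close=1.0, w_sync=0.5):
    """Hypothetical reward sketch (not from the paper): reward each arm for
    closing its tip-to-target distance, and penalize spread of progress
    within each cooperating group of arms."""
    arm_tip_dists = np.asarray(arm_tip_dists, dtype=float)
    group_ids = np.asarray(group_ids)
    # Approach term: smaller average tip-to-target distance -> higher reward.
    r_close = -w_close * arm_tip_dists.mean()
    # Coordination term: arms in the same group should progress together,
    # so penalize the standard deviation of distances inside each group.
    r_sync = 0.0
    for g in np.unique(group_ids):
        d = arm_tip_dists[group_ids == g]
        r_sync -= w_sync * d.std()
    return r_close + r_sync

# Four arms in two groups: synchronized closing scores higher than uneven closing.
r_even = grouped_cooperation_reward([1.0, 1.0, 1.0, 1.0], [0, 0, 1, 1])
r_uneven = grouped_cooperation_reward([0.5, 1.5, 1.0, 1.0], [0, 0, 1, 1])
```

In an SAC training loop, a shaped scalar like this would be returned by the environment at every step; the grouping term is what would push arms in the same group toward synchronized envelope closure rather than one arm racing ahead.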