Robotic pin-hole assembly method integrating prior knowledge and guided policy search

Authors: Chen Haojie, Dong Qingwei, Liu Ruikai, Zeng Peng [1,2,3]

Affiliations: [1] State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China; [2] Key Laboratory of Networked Control Systems, Chinese Academy of Sciences, Shenyang 110016, China; [3] Institutes for Robotics & Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang 110169, China; [4] University of Chinese Academy of Sciences, Beijing 100049, China

Source: Application Research of Computers (《计算机应用研究》), 2025, No. 4, pp. 1018-1024 (7 pages)

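Funding: National Natural Science Foundation of China (92267301, 92067205, 92267205); Natural Science Foundation of Liaoning Province (2024-MSBA-83); State Key Laboratory of Robotics (2023-Z15); National Postdoctoral Foundation project (GZB20230805).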

Abstract: In modern industrial automation, the ability of robots to perform complex assembly tasks is crucial. Reinforcement learning offers an effective route to robot policy learning, but in the early stage of policy training for assembly tasks it suffers from low sampling efficiency and poor sample quality, which slow convergence and make the algorithm prone to local optima. To address these problems, this paper proposes a robot trajectory planning method that integrates prior knowledge with the guided policy search algorithm. The method first builds an initial policy from prior knowledge drawn from human expert demonstrations and historical task data, and retains that prior knowledge in the experience pool to improve learning efficiency. The guided policy search algorithm then optimizes the initial policy online, gradually improving its precision and adaptability. Finally, experiments on a robotic pin-hole assembly task show that the method significantly improves policy learning efficiency and reduces training time and the number of trial-and-error attempts. The results indicate that integrating prior knowledge effectively improves the learning efficiency of reinforcement learning, allowing the robot to quickly obtain a policy that can complete the assembly task.
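The abstract's idea of retaining prior knowledge in the experience pool can be illustrated with a minimal sketch. It is not the authors' implementation and does not implement guided policy search itself; the class name `PriorSeededPool`, the `prior_fraction` mixing ratio, and the first-in-first-out eviction of online samples are all assumptions made here for illustration.

```python
import random
from collections import deque


class PriorSeededPool:
    """Experience pool that never evicts prior-knowledge samples.

    Hypothetical sketch: the name, capacity, and sampling ratio are
    illustrative assumptions, not details taken from the paper.
    """

    def __init__(self, online_capacity, prior_fraction=0.25, seed=0):
        self.prior = []                              # demonstrations / historical data, kept permanently
        self.online = deque(maxlen=online_capacity)  # fresh rollouts, oldest evicted first
        self.prior_fraction = prior_fraction
        self.rng = random.Random(seed)

    def add_prior(self, transition):
        self.prior.append(transition)

    def add_online(self, transition):
        self.online.append(transition)

    def sample(self, batch_size):
        """Draw a batch (with replacement) mixing prior and online samples."""
        n_prior = round(batch_size * self.prior_fraction) if self.prior else 0
        if not self.online:
            # No rollouts yet: fall back to prior samples only.
            n_prior = batch_size if self.prior else 0
        n_online = batch_size - n_prior if self.online else 0
        online = list(self.online)
        batch = [self.rng.choice(self.prior) for _ in range(n_prior)]
        batch += [self.rng.choice(online) for _ in range(n_online)]
        return batch


pool = PriorSeededPool(online_capacity=3)
for i in range(2):
    pool.add_prior(("demo", i))
for i in range(5):
    pool.add_online(("rollout", i))  # only the 3 newest rollouts survive
batch = pool.sample(8)
```

Keeping the prior samples in a separate, unbounded list is one simple way to guarantee they are never overwritten by later rollouts; a real system would likely tune the mixing ratio or decay it as online data improves.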

Keywords: reinforcement learning; prior knowledge; guided policy search; policy optimization; pin-hole assembly task

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]
