Obstacle Avoidance Path Planning Method of Robotic Arm Based on MRD-DDPG

Authors: FU Ziqiang; ZHENG Weiqiang; ZHANG Liping; HE Li [1]; YUAN Liang; SHAO Mingming

Affiliations: [1] School of Mechanical Engineering, Xinjiang University, Urumqi 830047, China; [2] School of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China

Source: Modular Machine Tool & Automatic Manufacturing Technique, 2023, No. 7, pp. 41-45 (5 pages)

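Funding: National Natural Science Foundation of China (62063033); Xinjiang Uygur Autonomous Region Science and Technology Aid-Xinjiang Project Program (2021E02049).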

Abstract: The MRD-DDPG algorithm is applied to obstacle avoidance path planning for a robotic arm to address the low learning efficiency and low sample utilization of the DDPG algorithm during training. First, building on DDPG, the experience replay mechanism is improved to give the multi-replay buffer delay sampling deep deterministic policy gradient (MRD-DDPG) algorithm, which effectively alleviates the low sample utilization. Second, to address sparse rewards during the arm's interactive exploration, a position reward function suited to obstacle avoidance path planning is designed, which effectively improves the agent's learning efficiency. Experimental results show that the average success rate of obstacle avoidance path planning is about 97%; the average success rate of MRD-DDPG is 88% higher than that of DDPG; and the average planning time of the robotic arm is 0.638 s.

Keywords: deep reinforcement learning; DDPG; reward function; robotic arm; path planning

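Classification: TH166 [Mechanical Engineering: Machinery Manufacturing and Automation]; TG659 [Metallurgy and Metal Technology: Metal Cutting and Machine Tools]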
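
The abstract mentions a position reward function designed to counter sparse rewards, but this record does not give its formula. The following is only an illustrative sketch of what a dense, position-based reward for obstacle avoidance can look like; it is not the authors' reward function. The function name `position_reward`, the spherical obstacle model, and every threshold and bonus value are assumptions made for the example.

```python
import math


def position_reward(effector, target, obstacle_center,
                    obstacle_radius=0.10, reach_tol=0.02,
                    reach_bonus=10.0, collision_penalty=-10.0):
    """Return (reward, done) for one step of a hypothetical reaching task.

    All parameters are illustrative assumptions, not values from the paper.
    Positions are 3-D points (x, y, z) in metres.
    """
    d_target = math.dist(effector, target)
    d_obstacle = math.dist(effector, obstacle_center)

    # Entering the (assumed spherical) obstacle ends the episode with a penalty.
    if d_obstacle < obstacle_radius:
        return collision_penalty, True

    # Reaching the target within tolerance ends the episode with a bonus.
    if d_target < reach_tol:
        return reach_bonus, True

    # Otherwise give a dense signal: the closer to the target, the higher
    # (less negative) the reward, so every step carries some information.
    return -d_target, False
```

Compared with a sparse reward that is nonzero only at success or collision, the distance term gives the agent feedback on every transition, which is the general idea behind using position information to speed up learning.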
