Authors: LI Ning; HE Yiliang; ZHAO Jianhui; LIU Zhaowei; TIAN Zhi (Hengshui Power Supply Branch, State Grid Hebei Electric Power Co., Ltd., Hengshui 053000, Hebei, China)
Affiliation: [1] Hengshui Power Supply Branch, State Grid Hebei Electric Power Co., Ltd., Hengshui 053000, Hebei, China
Source: Power System and Clean Energy (《电网与清洁能源》), 2024, No. 11, pp. 9-15 (7 pages)
Funding: Science and Technology Project of State Grid Corporation of China (kj2021-044).
Abstract: To achieve precise navigation of live-working robot arms (manipulators) in the power grid, a global weighted reward mechanism is proposed, and a precise navigation model for the robot arm is built on this mechanism and the double deep Q-network (DDQN) algorithm, addressing Q-value overestimation and low update efficiency. Obstacle avoidance and navigation of the robot arm during cross-line operation are studied in simulation. The results show that the best learning rate is 0.005 and that, compared with the immediate reward of the current state, the global weighted reward mechanism improves the efficiency of Q-value updates more effectively. For the cross-line operation model built on the global weighted reward mechanism and DDQN, the deviation after convergence falls to ±6.45. The resulting navigation model improves both convergence speed and accuracy, and achieves precise navigation for robot live working.
Keywords: live working; manipulator; deep reinforcement learning; double deep Q-network; precise navigation
Classification: TP242 [Automation and Computer Technology: Detection Technology and Automatic Equipment]
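Note: The abstract credits the double deep Q-network with reducing Q-value overestimation. As a minimal sketch of the standard Double DQN target only (not the authors' implementation), the snippet below contrasts it with the plain DQN target for a single transition. The function names, the discount factor `gamma`, and the sample Q-values are illustrative assumptions. The global weighted reward mechanism is not specified in this record, so `reward` is simply passed in as a number.

```python
from typing import Sequence


def dqn_target(reward: float, next_q_target: Sequence[float],
               gamma: float = 0.99, done: bool = False) -> float:
    """Plain DQN target: the target network both selects and evaluates
    the next action, which tends to overestimate Q-values."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward: float, next_q_online: Sequence[float],
                      next_q_target: Sequence[float],
                      gamma: float = 0.99, done: bool = False) -> float:
    """Standard Double DQN target: the online network selects the next
    action and the target network evaluates that action."""
    if done:
        return reward
    # Action selection uses the online network's estimates.
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # Action evaluation uses the target network's estimate for that action.
    return reward + gamma * next_q_target[best_action]


if __name__ == "__main__":
    # Hypothetical Q-values for the next state (three actions).
    q_online = [1.0, 2.0, 0.5]
    q_target = [1.5, 0.8, 3.0]
    print(dqn_target(1.0, q_target, gamma=0.9))
    print(double_dqn_target(1.0, q_online, q_target, gamma=0.9))
```

With these sample values the Double DQN target is lower than the plain DQN target, because the action chosen by the online network is not the one the target network rates highest.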