Authors: Wu Daiyan[1]; Liu Shilin[2]
Affiliations: [1] Department of Mechanical and Electrical Engineering, Anhui Lu'an Technician College, Lu'an 237001, China; [2] School of Electrical Engineering, Anhui Polytechnic University, Wuhu 241000, China
Source: Journal of Taizhou University, 2022, No. 6, pp. 13-20 (8 pages)
Funding: Major Project of Natural Science Research in Anhui Universities (KJ2018ZD066); Key Project of Natural Science Research in Anhui Universities (KJ2019A1184).
Abstract: To improve the adaptability of real-time manipulator obstacle avoidance, a control method based on improved Q-learning is proposed. First, deep reinforcement learning is used to reward and penalize the manipulator's actions, and feature representations are learned by a deep neural network. Then, a Markov decision process is defined by the state set, the action set, and the environment state transition probability matrix. At the same time, the normalized advantage function is combined with the Q-learning algorithm to support robot systems defined over continuous action spaces. Experimental results show that the proposed method overcomes the slow convergence of Q-learning, achieves real-time obstacle avoidance for a high-performance manipulator, and contributes to safe human-robot coexistence.
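The abstract does not give the authors' formulation, but the normalized advantage function it mentions is commonly written as a quadratic advantage around a greedy action, Q(s, a) = V(s) + A(s, a) with A(s, a) = -1/2 (a - mu(s))^T P(s) (a - mu(s)), which makes the maximizing action available in closed form for continuous action spaces. A minimal sketch under that assumption (the function name, example vectors, and matrix below are illustrative, not from the paper):

```python
# Sketch of a normalized-advantage-function (NAF) style Q-value:
#   Q(s, a) = V(s) + A(s, a)
#   A(s, a) = -1/2 (a - mu(s))^T P(s) (a - mu(s))
# Because P(s) is positive definite, A <= 0 everywhere and A = 0 exactly
# at a = mu(s), so argmax_a Q(s, a) = mu(s) without a search over actions.
import numpy as np

def naf_q_value(a, mu, P, V):
    """Q(s, a) for a quadratic advantage around the greedy action mu(s).

    a, mu : action vectors, shape (d,)
    P     : positive-definite matrix, shape (d, d)
    V     : scalar state value V(s)
    """
    diff = a - mu
    advantage = -0.5 * diff @ P @ diff  # non-positive, zero at a == mu
    return V + advantage

# Illustrative values (not from the paper):
mu = np.array([0.3, -0.1])               # greedy joint-velocity command
P = np.array([[2.0, 0.0], [0.0, 1.0]])   # placeholder precision matrix
V = 1.5
print(naf_q_value(mu, mu, P, V))          # the greedy action attains V(s)
print(naf_q_value(mu + 0.1, mu, P, V))    # any other action scores lower
```

In a NAF-style agent, mu(s), P(s), and V(s) would all be outputs of the deep network, and the target for the standard Q-learning update is r + gamma * V(s'), since V(s') is already the maximum of Q(s', ·).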