Affiliation: [1] School of Mechanical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
Source: Computer Simulation (《计算机仿真》), 2016, Issue 7, pp. 383-387 (5 pages)
Funding: Key Teacher Support Program for Higher Education Institutions — Domestic Visiting Scholar Program for Young Key Teachers (A1-5300-15-020201); Shanghai Higher Education Science and Technology Development Fund — Shanghai University Laboratory Technical Team Building Program (A2-B-8950-13-0714)
Abstract: The difficulty of self-balancing control for a two-wheeled robot lies in improving how quickly and stably the robot reaches balance. To address the slow convergence and tendency to diverge of traditional reinforcement learning algorithms, a hierarchical reinforcement learning algorithm is proposed. The algorithm decomposes the target task into several subtasks and searches for an optimal policy for each; when all sub-goals converge to their optimal values, the target task also converges to the optimum. In this algorithm, the reward function can be learned from a heuristic environment, which speeds up exploration of the unknown environment so that the robot achieves self-balance quickly and remains stable. Self-balancing simulation experiments were carried out on a two-wheeled robot. The simulation results show that, compared with the traditional reinforcement learning algorithm, the improved algorithm yields better convergence of each control state and stronger learning performance, effectively improving the stability control of the robot system.
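The decomposition described in the abstract — split the balancing task into subtasks, learn a policy per subtask, and guide exploration with a heuristic reward — can be illustrated with a minimal sketch. This is not the authors' implementation: the 1-D discretized tilt-angle model, the two-subtask split (coarse recovery vs. fine stabilization), the toy dynamics, and all parameter values below are assumptions made for this example.

```python
import random

random.seed(0)

ANGLES = list(range(-5, 6))    # discretized tilt angle (assumed state space)
ACTIONS = (-1, 0, 1)           # discretized wheel torque (assumed action space)

def subtask(angle):
    # Hierarchical decomposition (assumed): subtask 0 = coarse recovery
    # when the tilt is large, subtask 1 = fine stabilization near upright.
    return 0 if abs(angle) > 2 else 1

def heuristic_reward(angle):
    # Heuristic shaping: penalize tilt magnitude so exploration is guided
    # toward the upright goal state (angle == 0).
    return -abs(angle)

def step(angle, action):
    # Toy deterministic dynamics: gravity drifts the tilt outward by 1 bin,
    # applied torque moves it by 2 bins in the chosen direction.
    drift = (angle > 0) - (angle < 0)
    return max(-5, min(5, angle + drift + 2 * action))

# One Q-table per subtask, so each sub-goal converges on its own policy.
Q = [{(s, a): 0.0 for s in ANGLES for a in ACTIONS} for _ in range(2)]
alpha, gamma, eps = 0.5, 0.9, 0.1

for _ in range(2000):
    angle = random.choice(ANGLES)
    for _ in range(30):
        task = subtask(angle)
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda x: Q[task][(angle, x)])
        nxt = step(angle, a)
        # Bootstrap from the next state's own subtask table: this is the
        # hand-off point between the hierarchical levels.
        best_next = max(Q[subtask(nxt)][(nxt, x)] for x in ACTIONS)
        Q[task][(angle, a)] += alpha * (
            heuristic_reward(nxt) + gamma * best_next - Q[task][(angle, a)])
        angle = nxt
        if angle == 0:
            break

# Greedy rollout: the learned policy recovers from a large initial tilt.
angle, trace = 4, [4]
for _ in range(10):
    a = max(ACTIONS, key=lambda x: Q[subtask(angle)][(angle, x)])
    angle = step(angle, a)
    trace.append(angle)
    if angle == 0:
        break
print(trace)
```

In this sketch the shaped reward makes every step informative, so both subtask tables converge quickly on the tiny deterministic model; the paper's point is that the same structure curbs the slow convergence and divergence of flat reinforcement learning on the real robot dynamics.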
Classification: TP242 [Automation and Computer Technology — Detection Technology and Automatic Equipment]