Authors: 胡丹丹 (Hu Dandan)[1], 莫宇帅 (Mo Yushuai) (Robotics Institute, Civil Aviation University of China, Tianjin 300300, China)
Source: 《计算机应用研究》 (Application Research of Computers), 2020, No. 7, pp. 2068-2071 (4 pages)
Abstract: The case-based reasoning heuristic Q-learning (CB-HAQL) algorithm can fail to converge to a good policy when the quality of its case base is poor. To address this, the paper proposes an improved CB-HAQL algorithm built on an effective triggering mechanism. First, a trigger-based case base update mechanism is set according to the number of iterations: the case base is generated or updated only when a threshold is reached, which safeguards its quality. Second, a dynamic parameter adjusts how strongly cases influence action selection, so that the agent scales the heuristic influence according to how well it has mastered the environment. Finally, experience-oriented exploratory actions are added to speed up learning. Experiments show that the improved algorithm raises both policy quality and training speed, and a UAV completing its navigation task demonstrates the effectiveness of the learned policy.
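Keywords: UAV; obstacle avoidance; autonomous navigation; CB-HAQL; trigger mechanism
Classification: TP399 [Automation and Computer Technology: Computer Application Technology]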
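
The abstract names three mechanisms but gives no formulas, so the following is only a rough sketch of how the first two might fit together, not the authors' implementation. Every name and schedule here is an assumption made for illustration: the periodic trigger rule in `should_update_case_base`, the linear ramp in `heuristic_weight`, and the bonus term in `heuristic_row`, which follows a common heuristic Q-learning formulation that lifts the case-suggested action just above the current greedy one. The experience-oriented exploration step is omitted because the abstract does not define it.

```python
from typing import List, Optional


def should_update_case_base(episode: int, threshold: int) -> bool:
    """Trigger rule (assumed): refresh the case base every `threshold` episodes."""
    return episode > 0 and episode % threshold == 0


def heuristic_weight(episode: int, total_episodes: int, xi_max: float = 1.0) -> float:
    """Dynamic heuristic weight (assumed): linear ramp from 0 up to xi_max."""
    return xi_max * min(1.0, episode / total_episodes)


def heuristic_row(q_row: List[float], suggested: Optional[int],
                  eta: float = 1.0) -> List[float]:
    """Bonus vector: nonzero only for the action suggested by a retrieved case."""
    h = [0.0] * len(q_row)
    if suggested is not None:
        # Just enough (plus eta) to overtake the current greedy action.
        h[suggested] = max(q_row) - q_row[suggested] + eta
    return h


def select_action(q_row: List[float], suggested: Optional[int],
                  xi: float, eta: float = 1.0) -> int:
    """Greedy choice over Q(s, a) + xi * H(s, a); random exploration is omitted."""
    h = heuristic_row(q_row, suggested, eta)
    scores = [q + xi * bonus for q, bonus in zip(q_row, h)]
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    q_row = [0.5, 0.2, 0.1]   # hypothetical Q-values for one state
    suggested = 2             # hypothetical action proposed by a retrieved case
    for episode in (0, 10, 50, 100):
        xi = heuristic_weight(episode, total_episodes=100)
        action = select_action(q_row, suggested, xi)
        refresh = should_update_case_base(episode, threshold=50)
        print(episode, round(xi, 2), action, refresh)
```

With the weight at zero the agent simply follows its own Q-values; as the weight grows, the case suggestion eventually overrides the greedy action. Whether the influence should grow or shrink with the agent's mastery of the environment is not stated in the abstract, so the ramp direction is a placeholder.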