Authors: LI Weidong (李伟东)[1]; HUANG Zhenzhu (黄振柱); HE Jingwu (何精武); MA Caoyuan (马草原); GE Cheng (葛程)
Affiliation: [1] School of Automotive Engineering, Dalian University of Technology, Dalian, Liaoning 116024, China
Source: Computer Engineering and Applications (《计算机工程与应用》), 2024, No. 14, pp. 86-95 (10 pages)
Funding: Major Science and Technology Innovation Special Project of Liaoning Province (ZX20220560).
Abstract: The key to autonomous driving is that the decision layer issues accurate commands based on the information supplied by the perception stage. Reinforcement learning and imitation learning are better suited to complex scenarios than traditional rule-based methods. However, imitation learning, as represented by behavioral cloning, suffers from compounding errors; prioritized experience replay is used to improve behavioral cloning and strengthen the model's fit to the demonstration dataset. The original DDPG (deep deterministic policy gradient) algorithm explores inefficiently; experience pool separation and random network distillation (RND) are used to improve DDPG and raise its training efficiency. The improved algorithms are then trained jointly to reduce useless exploration in the early stage of DDPG training. Validation on the TORCS (The Open Racing Car Simulator) simulation platform shows that, within the same number of training iterations, the method learns more stable road keeping, speed keeping, and obstacle avoidance.
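Keywords: autonomous driving; reinforcement learning; imitation learning; decision-making algorithm; TORCS
Classification: TP391 [Automation and Computer Technology: Computer Application Technology]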
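Note: The abstract mentions applying prioritized experience replay to behavioral cloning but does not give details. The following is a minimal sketch of the commonly used proportional variant, assuming each demonstration pair's priority is its absolute imitation error. The class name `PrioritizedBuffer`, the parameters `alpha` and `eps`, and the toy data are hypothetical and are not taken from the paper.

```python
import random


class PrioritizedBuffer:
    """Proportional prioritized sampling over demonstration pairs (illustrative only)."""

    def __init__(self, alpha=0.6, eps=1e-3, seed=0):
        self.alpha = alpha  # assumed exponent: 0 gives uniform sampling
        self.eps = eps      # assumed offset keeping every priority positive
        self.items = []
        self.priorities = []
        self.rng = random.Random(seed)

    def add(self, item):
        # New samples get the current maximum priority so they are drawn at least once.
        self.items.append(item)
        self.priorities.append(max(self.priorities, default=1.0))

    def probabilities(self):
        # P(i) = p_i^alpha / sum_k p_k^alpha
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        return [s / total for s in scaled]

    def sample(self, batch_size):
        # Draw indices with replacement, weighted by the sampling probabilities.
        idx = self.rng.choices(range(len(self.items)),
                               weights=self.probabilities(), k=batch_size)
        return idx, [self.items[i] for i in idx]

    def update(self, indices, errors):
        # After a training step, set each priority to the absolute imitation error.
        for i, e in zip(indices, errors):
            self.priorities[i] = abs(e) + self.eps


# Toy usage: three (state, action) demonstration pairs with made-up errors.
buf = PrioritizedBuffer()
for pair in [("s0", "a0"), ("s1", "a1"), ("s2", "a2")]:
    buf.add(pair)
buf.update([0, 1, 2], [0.1, 0.1, 2.0])
probs = buf.probabilities()
batch_idx, batch = buf.sample(4)
```

In this sketch, the pair with the largest error is sampled most often, which is the general idea behind focusing behavioral cloning on poorly fitted demonstrations.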