A Driverless Decision Model Based on Improved Behavioral Cloning and DDPG (cited by: 1)

Improved Behavioral Cloning and DDPG’s Driverless Decision Model

Authors: LI Weidong [1]; HUANG Zhenzhu; HE Jingwu; MA Caoyuan; GE Cheng (School of Automotive Engineering, Dalian University of Technology, Dalian, Liaoning 116024, China)

Affiliation: [1] School of Automotive Engineering, Dalian University of Technology, Dalian, Liaoning 116024

Source: Computer Engineering and Applications (《计算机工程与应用》), 2024, No. 14, pp. 86-95 (10 pages)

Funding: Major Science and Technology Innovation Project of Liaoning Province (ZX20220560).

Abstract: The key to driverless technology is a decision-making layer that issues accurate commands based on the information supplied by the perception stage. Reinforcement learning and imitation learning are better suited to complex scenarios than traditional rule-based methods. However, imitation learning, as represented by behavioral cloning, suffers from compounding errors; this paper improves behavioral cloning with a prioritized experience replay algorithm, strengthening the model's ability to fit the demonstration dataset. The original DDPG (deep deterministic policy gradient) algorithm suffers from low exploration efficiency; it is improved with experience pool separation and random network distillation (RND) to raise its training efficiency. The improved algorithms are then trained jointly to reduce useless exploration in the early stage of DDPG training. Verification on the TORCS (The Open Racing Car Simulator) simulation platform shows that, within the same number of training iterations, the proposed method learns more stable lane keeping, speed keeping, and obstacle avoidance.

Keywords: driverless driving; reinforcement learning; imitation learning; decision algorithm; TORCS

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
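
The abstract names two mechanisms without showing them: prioritized sampling of demonstration data for behavioral cloning, and an RND exploration bonus for DDPG. The sketch below illustrates both in plain Python. It is not the authors' implementation. The proportional prioritization form, the linear predictor, and every name and hyperparameter (`sampling_probabilities`, `sample_batch`, `RNDBonus`, `alpha`, `beta`, `lr`) are assumptions made for illustration only.

```python
import math
import random


def sampling_probabilities(errors, alpha=0.6, eps=1e-6):
    """Proportional prioritization: samples with larger error are drawn more often.

    `errors` is assumed to hold one imitation error per demonstration sample,
    e.g. the gap between the policy's action and the expert's action.
    """
    priorities = [(abs(e) + eps) ** alpha for e in errors]
    total = sum(priorities)
    return [p / total for p in priorities]


def sample_batch(errors, batch_size, rng, alpha=0.6, beta=0.4):
    """Draw indices by priority and return importance-sampling weights.

    Weights are scaled so the largest weight in the batch equals 1.
    """
    probs = sampling_probabilities(errors, alpha)
    n = len(errors)
    idx = rng.choices(range(n), weights=probs, k=batch_size)
    weights = [(n * probs[i]) ** (-beta) for i in idx]
    w_max = max(weights)
    return idx, [w / w_max for w in weights]


class RNDBonus:
    """Toy random network distillation: a fixed random target network and a
    trained linear predictor. The squared prediction error is the novelty bonus."""

    def __init__(self, state_dim, feat_dim, rng, lr=0.1):
        # Target weights are random and never updated.
        self.w_target = [[rng.gauss(0.0, 1.0) for _ in range(feat_dim)]
                         for _ in range(state_dim)]
        # Predictor weights start at zero and are trained on visited states.
        self.w_pred = [[0.0] * feat_dim for _ in range(state_dim)]
        self.lr = lr

    def _error(self, state):
        err = []
        for j in range(len(self.w_pred[0])):
            target = math.tanh(sum(s * row[j] for s, row in zip(state, self.w_target)))
            pred = sum(s * row[j] for s, row in zip(state, self.w_pred))
            err.append(pred - target)
        return err

    def bonus(self, state):
        """Intrinsic reward: squared error between predictor and target features."""
        return sum(e * e for e in self._error(state))

    def update(self, state):
        """One gradient-descent step on the squared error for this state."""
        err = self._error(state)
        for i, s in enumerate(state):
            for j, e in enumerate(err):
                self.w_pred[i][j] -= self.lr * 2.0 * s * e
```

In a training loop of this kind, the bonus would typically be scaled and added to the environment reward before the transition is stored, so that states the predictor has seen often contribute less extra reward over time. The scaling factor and how it interacts with the separated experience pools are not specified in the abstract.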
