Authors: Song Xiaolin [1]; Sheng Xin; Cao Haotian [1]; Li Mingjun; Yi Binlin; Huang Zhi [1]
Affiliation: [1] State Key Laboratory of Advanced Design and Manufacturing for Vehicle Body, Hunan University, Changsha 410082
Source: Automotive Engineering, 2021, No. 1, pp. 59-67 (9 pages)
Funding: Supported by the National Natural Science Foundation of China (51975194) and the NSFC Young Scientists Fund (51905161).
Abstract: A lane-change behavior decision-making method for intelligent vehicles is proposed based on imitation learning and reinforcement learning. The macro decision-making module constructs an extreme gradient boosting (XGBoost) model through imitation learning and, according to the input information, selects a macro command among lane keeping, left lane change, and right lane change, thereby determining which lane-change decision sub-problem needs to be solved. Each refined decision-making sub-module obtains an optimized policy through deep deterministic policy gradient (DDPG) reinforcement learning and solves the corresponding sub-problem, determining the target position of the ego vehicle and sending it to the lower-level modules for execution. Simulation results show that the proposed method learns its policy faster than pure reinforcement learning, and that its overall performance is better than that of the finite state machine, behavior-cloning imitation learning, and pure reinforcement learning methods.
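The hierarchical structure described in the abstract (a macro classifier choosing among three commands, each dispatching to its own refined sub-policy) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the rule-based `macro_policy` stands in for the trained XGBoost classifier, the hand-written `sub_policy` stands in for the DDPG actors, and the `State` fields are hypothetical simplifications of the actual input features.

```python
from dataclasses import dataclass
from enum import Enum

class Macro(Enum):
    KEEP = 0   # lane keeping
    LEFT = 1   # left lane change
    RIGHT = 2  # right lane change

@dataclass
class State:
    lane: int          # current lane index
    gap_ahead: float   # distance to the lead vehicle (m)
    left_free: bool    # adjacent left lane has an acceptable gap
    right_free: bool   # adjacent right lane has an acceptable gap

def macro_policy(s: State) -> Macro:
    # Stand-in for the imitation-learned XGBoost model:
    # pick one macro command from the observed state.
    if s.gap_ahead > 50.0:
        return Macro.KEEP
    if s.left_free:
        return Macro.LEFT
    if s.right_free:
        return Macro.RIGHT
    return Macro.KEEP

def sub_policy(macro: Macro, s: State) -> tuple[int, float]:
    # Stand-in for the per-command DDPG actors: each macro command
    # has its own refined policy that outputs a movement target
    # (target lane, longitudinal offset in m) for lower-level modules.
    if macro is Macro.KEEP:
        return s.lane, min(s.gap_ahead - 10.0, 30.0)
    if macro is Macro.LEFT:
        return s.lane - 1, 25.0
    return s.lane + 1, 25.0

def decide(s: State) -> tuple[int, float]:
    # Two-stage decision: macro command first, then the matching
    # sub-problem is solved by the corresponding sub-policy.
    return sub_policy(macro_policy(s), s)
```

The split mirrors the paper's design choice: discrete mode selection is handled by a supervised classifier, while the continuous target-position refinement within each mode is left to a learned continuous-control policy.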