Authors: Song Xiaolin [1]; Sheng Xin; Cao Haotian [1]; Li Mingjun; Yi Binlin; Huang Zhi [1]
Affiliation: [1] State Key Laboratory of Advanced Design and Manufacturing for Vehicle Body, Hunan University, Changsha 410082
Source: Automotive Engineering, 2021, No. 1, pp. 59-67 (9 pages)
Funding: Supported by the National Natural Science Foundation of China (51975194) and the NSFC Young Scientists Fund (51905161).
Abstract: A lane-change behavior decision-making method for intelligent vehicles is proposed based on imitation learning and reinforcement learning. The macro decision-making module constructs an extreme gradient boosting (XGBoost) model through imitation learning and, according to the input information, selects a macro command among lane keeping, left lane change, and right lane change, thereby determining which lane-change decision sub-problem needs to be solved. Each refined decision-making sub-module obtains an optimized policy through deep deterministic policy gradient (DDPG) reinforcement learning and solves the corresponding sub-problem, determining the target position of the ego vehicle and sending it to the lower-level modules for execution. Simulation results show that the proposed method learns its policy faster than pure reinforcement learning, and that its overall performance is better than that of the finite state machine, behavior-cloning imitation learning, and pure reinforcement learning methods.
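The hierarchical structure described in the abstract (a macro classifier choosing among three commands, each dispatching to its own refined sub-policy) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the rule-based `macro_policy` stands in for the trained XGBoost classifier, the hand-written `sub_policy` stands in for the DDPG actors, and the `State` fields are hypothetical simplifications of the actual input features.

```python
from dataclasses import dataclass
from enum import Enum

class Macro(Enum):
    KEEP = 0   # lane keeping
    LEFT = 1   # left lane change
    RIGHT = 2  # right lane change

@dataclass
class State:
    lane: int          # current lane index
    gap_ahead: float   # distance to the lead vehicle (m)
    left_free: bool    # adjacent left lane has an acceptable gap
    right_free: bool   # adjacent right lane has an acceptable gap

def macro_policy(s: State) -> Macro:
    # Stand-in for the imitation-learned XGBoost model:
    # pick one macro command from the observed state.
    if s.gap_ahead > 50.0:
        return Macro.KEEP
    if s.left_free:
        return Macro.LEFT
    if s.right_free:
        return Macro.RIGHT
    return Macro.KEEP

def sub_policy(macro: Macro, s: State) -> tuple[int, float]:
    # Stand-in for the per-command DDPG actors: each macro command
    # has its own refined policy that outputs a movement target
    # (target lane, longitudinal offset in m) for lower-level modules.
    if macro is Macro.KEEP:
        return s.lane, min(s.gap_ahead - 10.0, 30.0)
    if macro is Macro.LEFT:
        return s.lane - 1, 25.0
    return s.lane + 1, 25.0

def decide(s: State) -> tuple[int, float]:
    # Two-stage decision: macro command first, then the matching
    # sub-problem is solved by the corresponding sub-policy.
    return sub_policy(macro_policy(s), s)
```

The split mirrors the paper's design choice: discrete mode selection is handled by a supervised classifier, while the continuous target-position refinement within each mode is left to a learned continuous-control policy.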