Author(s): Gao Zhenhai[1], Yan Xiangtong[1], Gao Fei[1]
Affiliation: [1] State Key Laboratory of Automotive Simulation and Control, Jilin University, Changchun 130022
Source: Automotive Engineering (汽车工程), 2022, No. 7, pp. 969-975 (7 pages)
Funding: Supported by the National Key R&D Program of China (2017YFB0102601) and the National Natural Science Foundation of China (51775236, U1564214).
Abstract: Obtaining autonomous driving decision-making strategies from human driver data is a current hot spot in autonomous driving research. Most classic reinforcement learning decision-making methods construct the reward function by hand, from formulas related to safety, comfort, and economy, and the resulting decision-making strategies still differ considerably from those of human drivers. This paper uses a maximum margin inverse reinforcement learning (IRL) algorithm: taking the driver's driving data as expert demonstrations, it learns a corresponding reward function and realizes driver-imitating longitudinal autonomous driving decision-making. Simulation test results show that, compared with the reinforcement learning method, the inverse reinforcement learning method extracts the reward function automatically from the driver's data, which reduces the difficulty of constructing the reward function, and the resulting decision-making strategy is more consistent with the driver's behavior.
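The abstract names the technique (maximum margin IRL over expert driving demonstrations) but gives no implementation details. Below is a minimal sketch of the projection variant of max-margin IRL (Abbeel & Ng, 2004), assuming a reward linear in state features, R(s) = wᵀφ(s). The feature map phi (speed, headway, acceleration for the longitudinal task), the function names, and the discount factor are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def feature_expectations(trajectories, phi, gamma=0.99):
    """Empirical discounted feature expectations:
    mu = mean over trajectories of sum_t gamma^t * phi(s_t)."""
    mus = [
        sum(gamma ** t * phi(s) for t, s in enumerate(traj))
        for traj in trajectories
    ]
    return np.mean(mus, axis=0)

def max_margin_irl_step(mu_expert, candidate_mus):
    """One round of the projection variant of max-margin IRL
    (Abbeel & Ng, 2004). Returns reward weights w, with
    R(s) = w . phi(s), and the current margin ||w||; the outer
    loop stops when the margin falls below a tolerance."""
    mu_bar = candidate_mus[0]
    for mu in candidate_mus[1:]:
        # Orthogonally project mu_expert onto the line through
        # mu_bar in direction (mu - mu_bar): the point on that
        # line closest to the expert's feature expectations.
        d = mu - mu_bar
        mu_bar = mu_bar + ((d @ (mu_expert - mu_bar)) / (d @ d)) * d
    w = mu_expert - mu_bar
    return w, np.linalg.norm(w)

# Hypothetical longitudinal feature map (field names are illustrative):
# ego speed, time headway to the lead vehicle, and acceleration.
phi = lambda s: np.array([s["speed"], s["headway"], s["accel"]])
```

In the full loop one would alternate: train a reinforcement learning policy under the current reward w, estimate its feature expectations with feature_expectations, append them to candidate_mus, and repeat max_margin_irl_step until the margin is small, at which point the learned policy's feature expectations, and hence its behavior, approximately match the driver's.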