Intelligent Vehicle Behavior Decision-Making Based on Deep Reinforcement Learning

Cited by: 3

Authors: ZHOU Heng-heng, GAO Song [1], WANG Peng-wei [1], CUI Kai-chen, ZHANG Yu-long (School of Transportation and Vehicle Engineering, Shandong University of Technology, Zibo 255000, China)

Affiliation: [1] School of Transportation and Vehicle Engineering, Shandong University of Technology, Zibo 255000, China

Source: Science Technology and Engineering (《科学技术与工程》), 2024, No. 12, pp. 5194-5203 (10 pages)

Funding: National Natural Science Foundation of China (52102465).

Abstract: The decision-making system of an autonomous vehicle directly affects the vehicle's overall driving performance and is one of the key challenges to be solved in realizing autonomous driving. To address this problem, an end-to-end driving behavior decision-making model based on the deep reinforcement learning algorithm DDPG (deep deterministic policy gradient) is proposed. First, drawing on a driver model, a 64-dimensional state space covering ego-vehicle, road, and interfering-vehicle information is selected as the input for training the decision-making model, which outputs reasonable driving behaviors and control quantities. Then, to address abrupt changes in the reward and control quantities observed during training and testing, the DDPG decision-making model is improved to optimize the decision and control performance. Finally, simulation experiments are conducted on the TORCS (The Open Racing Car Simulator) platform for verification. The results show that the proposed model outputs reasonable driving behaviors and control quantities based on the real-time state of the vehicle and its environment. Compared with the DDPG model, the improved model achieves better control accuracy, significantly reduces the vehicle's lateral speed, and clearly improves vehicle comfort and stability.

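Keywords: autonomous driving; behavior decision-making; deep reinforcement learning; deep deterministic policy gradient (DDPG) algorithm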

Classification: U463 [Mechanical Engineering: Vehicle Engineering]
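
The abstract names DDPG with a 64-dimensional state input but does not give the network layout, action definition, or the specific improvement. The sketch below is therefore only an illustration of three standard DDPG building blocks: a deterministic actor, the bootstrapped critic target, and the soft target-network update. The 64-dimensional state comes from the abstract; the three-component action (for example steering, throttle, brake), the hidden size, `tau`, `gamma`, and all function names are assumptions made for this example, not the authors' implementation.

```python
import math
import random

STATE_DIM = 64   # from the abstract: 64-dimensional state input
ACTION_DIM = 3   # assumption: e.g. steering, throttle, brake
HIDDEN = 16      # assumption: hidden layer width chosen for illustration


def init_layer(n_in, n_out, rng):
    """Create one dense layer with small random weights and zero biases."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return {"w": weights, "b": [0.0] * n_out}


def dense(layer, x, activation):
    """Apply a dense layer followed by an element-wise activation."""
    return [activation(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(layer["w"], layer["b"])]


def relu(v):
    return max(0.0, v)


def actor(params, state):
    """Deterministic policy mu(s): maps a state to actions in [-1, 1]."""
    hidden = dense(params[0], state, relu)
    return dense(params[1], hidden, math.tanh)


def td_target(reward, next_q, done, gamma=0.99):
    """Critic target y = r + gamma * Q'(s', mu'(s')), with no bootstrap at episode end."""
    return reward + (0.0 if done else gamma * next_q)


def soft_update(target, online, tau):
    """In-place Polyak update: target <- tau * online + (1 - tau) * target."""
    for t_layer, o_layer in zip(target, online):
        for t_row, o_row in zip(t_layer["w"], o_layer["w"]):
            for j, o in enumerate(o_row):
                t_row[j] = tau * o + (1.0 - tau) * t_row[j]
        for j, o in enumerate(o_layer["b"]):
            t_layer["b"][j] = tau * o + (1.0 - tau) * t_layer["b"][j]


rng = random.Random(0)
online_actor = [init_layer(STATE_DIM, HIDDEN, rng),
                init_layer(HIDDEN, ACTION_DIM, rng)]
target_actor = [init_layer(STATE_DIM, HIDDEN, rng),
                init_layer(HIDDEN, ACTION_DIM, rng)]

# A random placeholder state standing in for ego, road, and traffic features.
state = [rng.uniform(-1.0, 1.0) for _ in range(STATE_DIM)]
action = actor(online_actor, state)
```

In typical DDPG training `tau` is small (on the order of 0.001 to 0.01), so the target networks drift slowly toward the online networks, which is the usual way the algorithm keeps its critic targets stable.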
