Authors: 周恒恒 (ZHOU Heng-heng), 高松 (GAO Song)[1], 王鹏伟 (WANG Peng-wei)[1], 崔凯晨 (CUI Kai-chen), 张宇龙 (ZHANG Yu-long)
Affiliation: [1] School of Transportation and Vehicle Engineering, Shandong University of Technology, Zibo 255000, China
Source: 《科学技术与工程》 (Science Technology and Engineering), 2024, No. 12, pp. 5194-5203 (10 pages)
Funding: National Natural Science Foundation of China (52102465).
Abstract: The decision-making system of an autonomous vehicle directly affects the vehicle's overall driving performance and is one of the key challenges in realizing autonomous driving. To address this problem, an end-to-end driving behavior decision-making model is proposed based on the deep reinforcement learning algorithm DDPG (deep deterministic policy gradient). First, drawing on a driver model, 64 dimensions of state-space information covering the ego vehicle, the road, and interfering vehicles are selected as input to train the decision-making model, which outputs reasonable driving behaviors and control quantities. Then, to address abrupt changes in the reward and in the control quantities during training and testing, the DDPG decision-making model is improved to optimize decision and control performance. Finally, simulation experiments are conducted on the TORCS (The Open Racing Car Simulator) platform to verify the model. The results show that the proposed model outputs reasonable driving behaviors and control quantities based on real-time vehicle and environment state information. Compared with the DDPG model, the improved model achieves better control accuracy and a significantly lower lateral vehicle speed, with clear improvements in ride comfort and vehicle stability.
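Keywords: autonomous driving; behavior decision-making; deep reinforcement learning; deep deterministic policy gradient (DDPG) algorithm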
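
Note: The abstract names DDPG but gives no implementation detail. As a rough illustration only, and not the authors' model or their improvement, the sketch below shows two standard DDPG ingredients: a deterministic actor that maps a 64-dimensional state to bounded control values, and the soft (Polyak) update of a target network. The three-element action, the hidden-layer width, the weight initialisation, and the value of `tau` are all assumptions made for this example.

```python
import math
import random

STATE_DIM = 64   # from the abstract: 64-dimensional state input
ACTION_DIM = 3   # assumption: e.g. steering, throttle, brake
HIDDEN = 16      # assumption: hidden-layer width


def init_actor(seed=0):
    """Create small random weights for a one-hidden-layer actor (hypothetical layout)."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {"W1": mat(STATE_DIM, HIDDEN), "W2": mat(HIDDEN, ACTION_DIM)}


def act(params, state):
    """Deterministic policy: state -> control values squashed into [-1, 1] by tanh."""
    # Hidden layer: ReLU(state @ W1)
    hidden = [
        max(0.0, sum(s * row[j] for s, row in zip(state, params["W1"])))
        for j in range(HIDDEN)
    ]
    # Output layer: tanh(hidden @ W2)
    return [
        math.tanh(sum(h * row[k] for h, row in zip(hidden, params["W2"])))
        for k in range(ACTION_DIM)
    ]


def soft_update(target, online, tau=0.001):
    """Polyak averaging: target <- tau * online + (1 - tau) * target."""
    return {
        name: [
            [tau * o + (1.0 - tau) * t for o, t in zip(o_row, t_row)]
            for o_row, t_row in zip(online[name], target[name])
        ]
        for name in target
    }
```

A small `tau` makes the target network trail the online network slowly, which is the usual way DDPG stabilises its learning targets. Whether the paper's improvement touches this step is not stated in the abstract.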