Study on a Car-Following Model with Different Driving Styles Based on the Proximal Policy Optimization Algorithm


Authors: YAN Xin; HUANG Zhiqiu[1]; SHI Fan; XU Heng (School of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China)

Affiliation: [1] School of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China

Source: Computer Science (《计算机科学》), 2024, Issue 9, pp. 223-232 (10 pages)

Funding: Joint Funds of the National Natural Science Foundation of China (U2241216).

Abstract: Autonomous driving plays a crucial role in reducing traffic congestion and improving driving comfort, and raising public acceptance of the technology remains an important research problem. Customizing driving styles for users with different needs can help drivers understand autonomous driving behavior, improve the riding experience, and to some extent reduce psychological resistance to using autonomous driving systems. By analyzing car-following behavior in autonomous driving scenarios, this study proposes a design approach for deep reinforcement learning models with different driving styles based on the proximal policy optimization (PPO) algorithm. First, a large number of driving trajectories in the German highway driving dataset (HDD) are analyzed and classified according to time headway (THW), distance headway (DHW), acceleration, and following speed, and characteristic data for aggressive and conservative driving styles are extracted. On this basis, reward functions reflecting driver styles are encoded, and deep reinforcement learning models with different driving styles are generated through iterative learning, then simulated on the highway-env platform. Experimental results demonstrate that the PPO-based driving models with different styles are able to achieve their task objectives and, compared with the traditional intelligent driver model (IDM), accurately reflect distinct driving styles in their driving behavior.
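The style-encoded reward function described in the abstract can be sketched as follows. This is a minimal illustration only, not the authors' actual function: the target THW/speed values, the Gaussian shaping, and the acceleration penalty weights are all assumptions.

```python
import math

def style_reward(thw, accel, speed, style="aggressive"):
    """Reward-shaping sketch: score one time step by how closely time
    headway (THW), acceleration, and speed match a target driving style.
    Targets and weights below are illustrative, not taken from the paper."""
    targets = {
        # Aggressive drivers keep a short THW and a higher cruising speed;
        # conservative drivers keep larger headways and brake/accelerate gently.
        "aggressive":   {"thw": 1.0, "speed": 30.0, "accel_pen": 0.1},
        "conservative": {"thw": 2.5, "speed": 22.0, "accel_pen": 0.5},
    }
    t = targets[style]
    # Gaussian-shaped terms reward proximity to the style's target values.
    r_thw = math.exp(-(thw - t["thw"]) ** 2)
    r_speed = math.exp(-((speed - t["speed"]) / 5.0) ** 2)
    # Harsh acceleration is penalised more strongly for the conservative style.
    r_accel = -t["accel_pen"] * accel ** 2
    return r_thw + r_speed + r_accel
```

A reward of this shape peaks when the agent holds the style's target headway and speed, so PPO training against it pushes the policy toward that driving style.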

Keywords: autonomous driving; intelligent driver model; reinforcement learning; PPO algorithm; principal component analysis; K-means
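The clustering step suggested by the keywords (dimensionality reduction followed by K-means over per-driver THW/DHW/acceleration/speed features) can be sketched in plain NumPy. The synthetic feature vectors below are invented for illustration, and the PCA step the paper applies before clustering is omitted here.

```python
import numpy as np

def kmeans(X, k=2, iters=50, seed=0):
    """Plain NumPy k-means: cluster per-driver feature vectors
    (e.g. mean THW and mean DHW) into k driving-style groups."""
    rng = np.random.default_rng(seed)
    # Initialise centers from k distinct data points.
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = np.argmin(((X[:, None, :] - centers) ** 2).sum(axis=2), axis=1)
        # Move each center to the mean of its assigned points.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

# Synthetic drivers as [mean THW (s), mean DHW (m)]:
# short headways suggest an aggressive style, long headways a conservative one.
X = np.array([[0.9, 12.0], [1.1, 15.0], [2.4, 40.0], [2.6, 45.0]])
labels, centers = kmeans(X, k=2)
```

With clearly separated feature groups like these, the two short-headway drivers and the two long-headway drivers end up in different clusters, which is the grouping the reward functions are then built from.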

CLC number: TP391 [Automation and Computer Technology / Computer Application Technology]
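For reference, the comparison baseline mentioned in the abstract, the intelligent driver model (IDM), computes the following vehicle's acceleration from its speed and the gap to the leader. This is the standard formulation (Treiber et al.); the parameter values used in the paper are not reproduced here:

$$
\dot v = a_{\max}\left[1 - \left(\frac{v}{v_0}\right)^{\delta} - \left(\frac{s^*(v,\Delta v)}{s}\right)^{2}\right],
\qquad
s^*(v,\Delta v) = s_0 + vT + \frac{v\,\Delta v}{2\sqrt{a_{\max} b}}
$$

where $v$ is the current speed, $v_0$ the desired speed, $s$ the gap to the leading vehicle, $\Delta v$ the approach rate, $T$ the desired time headway, $s_0$ the minimum gap, $a_{\max}$ the maximum acceleration, and $b$ the comfortable deceleration. Because $T$ and $s_0$ are fixed, IDM realises a single driving style, which is the behavior the PPO models are contrasted against.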

 
