Method of Deep Reinforcement Learning Autonomous Driving Strategy Based on Heterogeneous Fusion Features

(Original Chinese title: 基于异构融合特征的深度强化学习自动驾驶决策方法)


Authors: FENG Tian; SHI Chaoxia [1]; WANG Yanqing

Affiliations: [1] College of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094; [2] Institute of Information Engineering, Nanjing Xiaozhuang University, Nanjing 211171

Source: Computer & Digital Engineering (《计算机与数字工程》), 2022, No. 9, pp. 1929-1934 (6 pages)

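Funding: Supported by the National Defense Science and Technology Innovation Special Zone Spark Project (No. 2016300TS009091) and the General Program of the National Natural Science Foundation of China (No. 61371040).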

Abstract: Among decision-making methods for autonomous driving, traditional modular methods are limited by how broadly their datasets cover driving scenarios, and reinforcement-learning-based methods struggle to learn effectively when the input is high-dimensional and the action space is continuous. To address these problems, a deep reinforcement learning decision-making method for autonomous driving based on heterogeneous fusion features is proposed. First, an image dimensionality reduction network is pre-trained on a moderate amount of driving data. The image features produced by this network are then fused with vehicle state features, and the fused features serve as the input to a Deep Deterministic Policy Gradient (DDPG) reinforcement learning framework. Learning is guided by a reward function tailored to autonomous driving that combines speed, steering angle, vehicle position, collision, and other information, and experience replay and target networks are used to speed up training convergence. The proposed method effectively shortens training time and maintains high stability and robustness in complex urban environments.
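The components named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the network architecture, the fusion operator, or the reward weights. The sketch assumes that fusion is plain concatenation, and the function names, the reward weights, `target_speed`, `lane_half_width`, and `tau` are all hypothetical values chosen for illustration.

```python
import random
from collections import deque


def fuse_features(image_features, vehicle_state):
    """Fuse the two feature types (assumed here to be simple concatenation)."""
    return list(image_features) + list(vehicle_state)


def reward(speed, steering, lateral_offset, collided,
           target_speed=8.0, lane_half_width=2.0):
    """Hypothetical shaped reward combining speed, steering angle,
    lateral position, and collision. All weights are illustrative."""
    if collided:
        return -10.0  # a collision overrides every other term
    speed_term = 1.0 - abs(speed - target_speed) / target_speed
    steer_term = -0.5 * abs(steering)
    position_term = -abs(lateral_offset) / lane_half_width
    return speed_term + steer_term + position_term


class ReplayBuffer:
    """Fixed-size experience replay: the oldest transitions are dropped first."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, rew, next_state, done):
        self.buffer.append((state, action, rew, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement breaks temporal correlation.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def soft_update(target_params, online_params, tau=0.01):
    """Move target-network parameters a fraction tau toward the online ones."""
    return [(1.0 - tau) * t + tau * o
            for t, o in zip(target_params, online_params)]
```

In a full DDPG loop, the fused vector would feed the actor and critic networks, sampled minibatches would drive the gradient updates, and `soft_update` would be applied to both target networks after each update step.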

Keywords: deep reinforcement learning; autonomous driving; heterogeneous fusion features; DDPG; reward function

Classification code: V323.19 [Aerospace Science and Technology: Man-Machine and Environmental Engineering]

 
