Simulation on unmanned driving behaviour decision-making based on PD3PG

Authors: CAO Ke-rang; WANG Han; LIU Ya-ru [1,2]; FAN Hui-jie; LIANG Lin-qi [1,2]

Affiliations: [1] Department of Computer Science and Technology, Shenyang University of Chemical Technology, Shenyang 110142, Liaoning, China; [2] Liaoning Provincial Key Laboratory of Industrial Intelligence Technology on Chemical Processes, Shenyang University of Chemical Technology, Shenyang 110142, Liaoning, China; [3] State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, Liaoning, China

Source: Computer Engineering and Design, 2025, No. 4, pp. 1149-1156 (8 pages)

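Funding: General Program of the National Natural Science Foundation of China (62273339); 2023 Basic Scientific Research General Program of the Education Department of Liaoning Province (JYTMS20231518).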

Abstract: To improve the behavior decision-making and control ability of unmanned vehicles, the deep deterministic policy gradient (DDPG) algorithm from deep reinforcement learning is applied to unmanned driving behavior decision-making. A deterministic policy gradient algorithm, PD3PG, is proposed that combines a hybrid prioritized experience replay mechanism with a dueling network. A behavior decision-making model for unmanned driving is constructed, and a reward function is designed for it. PD3PG raises the utilization of important experiences and speeds up neural network training. Experiments on the TORCS simulation platform verify that, compared with DDPG, PD3PG converges faster, achieves higher episode rewards, and keeps a more stable offset, giving better behavior decision-making control.
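The abstract does not spell out how the hybrid prioritized experience replay mechanism works, so the following is only a minimal sketch of standard proportional prioritized replay, the general idea that such mechanisms build on. It is not the authors' PD3PG implementation. The class name `PrioritizedReplayBuffer` and the parameters `alpha`, `beta`, and `eps` are illustrative assumptions, not taken from the paper.

```python
import random


class PrioritizedReplayBuffer:
    """Proportional prioritized experience replay (illustrative sketch only)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-3):
        self.capacity = capacity
        self.alpha = alpha    # assumed: how strongly priorities skew sampling (0 = uniform)
        self.eps = eps        # assumed: keeps every priority strictly positive
        self.data = []
        self.priorities = []
        self.pos = 0          # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New transitions get the current maximum priority so that they are
        # likely to be sampled before their TD error is known.
        p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        # Sampling probability is proportional to priority ** alpha.
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        idxs = random.choices(range(len(self.data)), weights=probs, k=batch_size)
        # Importance-sampling weights correct the bias of non-uniform sampling;
        # they are normalized by the largest weight in the batch.
        n = len(self.data)
        weights = [(n * probs[i]) ** (-beta) for i in idxs]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        batch = [self.data[i] for i in idxs]
        return batch, idxs, weights

    def update_priorities(self, idxs, td_errors):
        # Transitions with a larger absolute TD error are replayed more often.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In a DDPG-style training loop, the critic's TD errors for a sampled batch would be passed back through `update_priorities`, and the returned weights would scale the critic loss. How PD3PG mixes this with other sampling strategies, and how its dueling network is structured, would need to be taken from the full paper.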

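Keywords: deep reinforcement learning; deep deterministic policy gradient algorithm; unmanned driving; behavior decision-making; reward function; experience replay; dueling network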

Classification: TP399 [Automation and Computer Technology: Computer Application Technology]
