UGV Path Programming Based on the DQN With Noise in the Output Layer

Cited by: 5


Authors: LI Yang (李杨); YAN Dongmei (闫冬梅); LIU Lei (刘磊) [1]

Affiliations: [1] College of Science, Hohai University, Nanjing 211100, P.R. China; [2] School of Modern Posts, Nanjing University of Posts and Telecommunications, Nanjing 211100, P.R. China

Source: Applied Mathematics and Mechanics (《应用数学和力学》), 2023, No. 4, pp. 450-460 (11 pages)

Funding: National Natural Science Foundation of China (General Program) (61773152).

Abstract: The path planning problem of an unmanned ground vehicle (UGV) is studied within the framework of the deep Q-network (DQN) algorithm. To improve exploration efficiency, the DQN algorithm, which normally handles continuous states, is adapted and applied to discrete states. To balance exploration and exploitation, Gaussian noise is added only to the output layer of the DQN network, and a progressive reward function is designed. Experiments are carried out in the Gazebo simulation environment. The simulation results show that (1) the strategy quickly plans a collision-free route from the initial point to the target point and converges faster than the Q-learning, DQN, and noisynet_DQN algorithms; and (2) the strategy generalizes across initial points, target points, and obstacles, which verifies its effectiveness and robustness.
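The two ideas named in the abstract, noise confined to the output layer and a progressive reward, can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not give the network architecture, the noise parameterization, or the reward formula. The class `NoisyOutputLayer`, the function `progressive_reward`, and every constant below (`sigma_init`, `goal_radius`, the reward magnitudes) are assumptions chosen for illustration. The layer follows the common NoisyNet-style form in which each weight is a mean plus a scaled Gaussian sample, and the reward is assumed to grow as the vehicle gets closer to the target.

```python
import random


class NoisyOutputLayer:
    """Hypothetical output layer: q = (W_mu + W_sigma * eps) h + (b_mu + b_sigma * eps)."""

    def __init__(self, n_in, n_actions, sigma_init=0.1, seed=0):
        self.rng = random.Random(seed)
        bound = 1.0 / n_in ** 0.5
        # Mean weights, one row per action.
        self.w_mu = [[self.rng.uniform(-bound, bound) for _ in range(n_in)]
                     for _ in range(n_actions)]
        self.b_mu = [0.0] * n_actions
        # Noise scales; a trained network would learn these (assumed fixed here).
        self.w_sigma = [[sigma_init] * n_in for _ in range(n_actions)]
        self.b_sigma = [sigma_init] * n_actions

    def forward(self, h, noisy=True):
        """Map hidden features h to one Q-value per action."""
        q_values = []
        for a in range(len(self.b_mu)):
            total = self.b_mu[a]
            if noisy:
                total += self.b_sigma[a] * self.rng.gauss(0.0, 1.0)
            for j, h_j in enumerate(h):
                w = self.w_mu[a][j]
                if noisy:
                    # Fresh Gaussian perturbation for each weight.
                    w += self.w_sigma[a][j] * self.rng.gauss(0.0, 1.0)
                total += w * h_j
            q_values.append(total)
        return q_values


def greedy_action(q_values):
    """Index of the largest Q-value; exploration comes from the noise itself."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


def progressive_reward(prev_dist, dist, collided, goal_radius=0.2):
    """Assumed shaping: penalize collisions, reward progress toward the target."""
    if collided:
        return -100.0
    if dist <= goal_radius:
        return 100.0
    return 10.0 * (prev_dist - dist)


if __name__ == "__main__":
    layer = NoisyOutputLayer(n_in=4, n_actions=3)
    features = [0.5, -0.2, 0.1, 0.9]  # stand-in for hidden-layer activations
    print(greedy_action(layer.forward(features)))
    print(progressive_reward(prev_dist=2.0, dist=1.5, collided=False))
```

In this sketch the agent always acts greedily on the noisy Q-values, so no separate epsilon-greedy schedule is needed; calling `forward(..., noisy=False)` gives the deterministic evaluation-time output.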

Keywords: deep reinforcement learning; unmanned ground vehicle (UGV); DQN algorithm; Gaussian noise; path planning; Gazebo simulation

Classification number: O29 [Science: Applied Mathematics]

 
