Autonomous navigation based on PPO for mobile platform

Cited by: 2

Authors: XU Guoyan (徐国艳)[1], XIONG Yiwei (熊绎维), ZHOU Bin (周彬)[1], CHEN Guanhong (陈冠宏)

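Affiliation: [1] Key Laboratory of Autonomous Transportation Technology for Special Vehicles, Ministry of Industry and Information Technology, School of Transportation Science and Engineering, Beihang University, Beijing 100191, China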

Source: Journal of Beijing University of Aeronautics and Astronautics, 2022, No. 11, pp. 2138-2145 (8 pages)

Funding: National Natural Science Foundation of China (51775016).

Abstract: To address the discontinuous action output and difficult training convergence of reinforcement learning algorithms in autonomous navigation tasks, this paper presents an autonomous navigation method for mobile platforms based on the proximal policy optimization (PPO) algorithm. GNSS and LADAR are used to sense environment information. To define the state of the reinforcement learning model, an ego position evaluation method based on an improved artificial potential field algorithm is introduced, which effectively improves the convergence speed of the model in autonomous navigation scenarios. On the basis of the PPO algorithm, an action policy function based on the Gaussian (normal) distribution is designed, which makes the platform's linear velocity and yaw rate outputs continuous. The network framework and reward function of the model are also designed for navigation scenarios. The model is trained in a virtual environment built on Gazebo, and the training results show that the model with ego position evaluation converges markedly faster. Finally, the converged model is transferred to a real environment, which verifies the effectiveness of the proposed method.
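The ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the network, the reward function, or the details of the improved artificial potential field, so the code below falls back on the standard PPO clipped surrogate objective, an independent Gaussian policy over (linear velocity, yaw rate), and the classical attractive/repulsive potential field. All function names, gains (`k_att`, `k_rep`), the influence radius `d0`, and the clipping value `clip_eps` are hypothetical choices made for illustration.

```python
import math
import random


def gaussian_log_prob(action, mean, std):
    """Log-density of independent Gaussians, summed over action dimensions."""
    return sum(
        -0.5 * ((a - m) / s) ** 2 - math.log(s) - 0.5 * math.log(2.0 * math.pi)
        for a, m, s in zip(action, mean, std)
    )


def sample_action(mean, std, rng):
    """Draw a continuous (linear velocity, yaw rate) action from the policy."""
    return [rng.gauss(m, s) for m, s in zip(mean, std)]


def ppo_clip_objective(log_prob_new, log_prob_old, advantage, clip_eps=0.2):
    """Standard PPO clipped surrogate objective for a single sample."""
    ratio = math.exp(log_prob_new - log_prob_old)
    clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    return min(ratio * advantage, clipped * advantage)


def apf_potential(position, goal, obstacles, k_att=1.0, k_rep=1.0, d0=2.0):
    """Classical (not the paper's improved) potential at a 2-D position."""
    d_goal = math.dist(position, goal)
    potential = 0.5 * k_att * d_goal ** 2  # attractive term pulls toward goal
    for obs in obstacles:
        d = math.dist(position, obs)
        if 0.0 < d <= d0:  # repulsion acts only inside the influence radius
            potential += 0.5 * k_rep * (1.0 / d - 1.0 / d0) ** 2
    return potential


if __name__ == "__main__":
    rng = random.Random(0)
    mean, std = [0.5, 0.0], [0.1, 0.2]  # assumed policy outputs
    action = sample_action(mean, std, rng)
    print(action, gaussian_log_prob(action, mean, std))
    print(apf_potential((0.0, 0.0), (3.0, 4.0), [(1.0, 0.0)]))
```

Sampling from a Gaussian is what makes the action continuous: the policy outputs a mean and a standard deviation for each command instead of choosing among a fixed set of discrete velocities. A scalar such as the potential value could then be appended to the observation as a position score, which is one plausible reading of "ego position evaluation" in the abstract.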

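Keywords: proximal policy optimization algorithm; mobile platform; autonomous navigation; reinforcement learning; artificial potential field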

Classification: TP242.6 [Automation and Computer Technology - Detection Technology and Automation Devices]
