Research on Vehicle Control Algorithm Based on Distributed Reinforcement Learning (cited by: 7)

Authors: Liu Weiguo; Xiang Zhiyu [1]; Liu Weiping; Qi Daoxin; Wang Zixu

Affiliations: [1] School of Information and Electronic Engineering, Zhejiang University, Hangzhou 310058; [2] National Innovation Center of Intelligent and Connected Vehicles, Beijing 100160

Source: Automotive Engineering (《汽车工程》), 2023, No. 9, pp. 1637-1645 (9 pages)

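Funding: Supported by the National New Generation Artificial Intelligence Open Innovation Platform for Autonomous Driving project (2020AAA0103702).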

Abstract: The development of end-to-end autonomous driving algorithms has become a focus of current autonomous driving research and development. Classic reinforcement learning (RL) algorithms use information such as vehicle state and environmental feedback to train a vehicle to drive, learning the best policy through trial and error, and thereby enable end-to-end autonomous driving algorithms to be developed. However, development efficiency remains low. To address the inefficiency and high complexity of training RL algorithms in a virtual simulation environment, this paper proposes an asynchronous distributed RL framework and establishes an inter-process and intra-process multi-agent parallel soft actor-critic (SAC) distributed training framework, which accelerates online RL training on the Carla simulator. In addition, to achieve rapid model training and deployment, the paper proposes a Cloud-OTA-based system architecture for distributed model training and deployment, which mainly consists of an over-the-air (OTA) platform, a cloud-based distributed training platform, and an on-vehicle computing platform. On this basis, to improve model reusability and reduce migration and deployment costs, the paper builds a ROS-based Autoware-Carla integrated validation framework. The experimental results show that, in a qualitative comparison with several mainstream autonomous driving methods, the proposed method trains faster, copes effectively with dense traffic flow, improves the adaptability of end-to-end autonomous driving policies to unknown scenarios, and reduces the time and resources required for experiments in real environments.
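The asynchronous actor-learner pattern that the abstract describes can be illustrated with a minimal sketch. It is not the authors' implementation. It rests on several assumptions: threads stand in for the separate processes, each worker steps several agents in turn (the "intra-process" parallelism), the transitions are random placeholders instead of Carla observations, and the SAC gradient step is reduced to a counter. All names (`actor_worker`, `learner`, `run`) are hypothetical.

```python
import queue
import random
import threading
from collections import deque


def actor_worker(worker_id, n_agents, n_steps, out_queue):
    """One worker steps several agents in turn and ships transitions to the learner."""
    rng = random.Random(worker_id)  # per-worker seed keeps the placeholder data reproducible
    for _ in range(n_steps):
        for agent_id in range(n_agents):
            # Placeholder transition (assumption): a real actor would query
            # the simulator and the current policy here.
            state = rng.random()
            action = rng.uniform(-1.0, 1.0)
            reward = -abs(action)
            out_queue.put((worker_id, agent_id, state, action, reward))
    out_queue.put(None)  # sentinel: this worker has finished


def learner(in_queue, n_workers, replay, batch_size=8):
    """Consume transitions as they arrive, without waiting for any worker to finish."""
    rng = random.Random(0)
    finished, updates = 0, 0
    while finished < n_workers:
        item = in_queue.get()
        if item is None:
            finished += 1
            continue
        replay.append(item)
        if len(replay) >= batch_size:
            batch = rng.sample(list(replay), batch_size)
            # A real learner would take a SAC gradient step on `batch` here;
            # this sketch only counts how many updates were possible.
            updates += 1
    return updates


def run(n_workers=3, n_agents=2, n_steps=5):
    """Start the workers, run the learner in the calling thread, return (buffer size, updates)."""
    transitions = queue.Queue()
    replay = deque(maxlen=1000)
    threads = [
        threading.Thread(target=actor_worker, args=(w, n_agents, n_steps, transitions))
        for w in range(n_workers)
    ]
    for t in threads:
        t.start()
    updates = learner(transitions, n_workers, replay)
    for t in threads:
        t.join()
    return len(replay), updates
```

Calling `run()` starts three workers with two agents each. The learner begins sampling as soon as the shared buffer holds one batch, so data collection and updates overlap instead of alternating.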

Keywords: reinforcement learning; distributed; multi-agent; autonomous driving; Carla; vehicle control

Classification: U463.6 [Mechanical Engineering: Vehicle Engineering]
