Authors: ZHANG Xing-Long, LU Yang, LI Wen-Zhang, XU Xin (College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073)
Source: Acta Automatica Sinica, 2023, No. 12, pp. 2481-2492 (12 pages)
Funding: Supported by the National Key Research and Development Program of China (2018YFB1305105) and the National Natural Science Foundation of China (62003361, 61825305).
Abstract: This paper presents a receding horizon reinforcement learning (RHRL) algorithm for high-accuracy lateral control of intelligent vehicles. The overall lateral control is composed of a feedforward term, computed directly from the curvature of the reference path and the dynamic model, and a feedback term generated by solving an optimal tracking control problem with the proposed RHRL algorithm. RHRL adopts a receding horizon optimization mechanism that decomposes the infinite-horizon optimal control problem into a sequence of finite-horizon problems. Unlike existing finite-horizon actor-critic learning algorithms, in each prediction horizon RHRL uses a time-independent actor-critic structure to learn the optimal value function and control policy. Also, in contrast to model predictive control (MPC), which computes an open-loop control sequence, the controller learned by RHRL is an explicit state-feedback control policy that can be deployed directly offline or learned and deployed synchronously online. Moreover, the convergence of the RHRL algorithm in each prediction horizon is proven, and the stability of the closed-loop system is analyzed. Simulation studies on a structured road show that the proposed RHRL algorithm outperforms current state-of-the-art methods. Experiments on an intelligent driving platform built on a Hongqi E-HS3 electric car show that RHRL performs better than the pure pursuit method on a closed structured city road, and exhibits strong adaptability to road conditions and satisfactory control performance on an undulating country gravel road.
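The feedforward term described in the abstract maps reference-path curvature to a steering command. As a rough illustration only: the sketch below uses a simplified kinematic bicycle model (the paper itself derives its feedforward from a dynamic vehicle model, which adds slip-dependent corrections at speed); the function name and the wheelbase value are illustrative assumptions, not from the paper.

```python
import math

def feedforward_steer(curvature: float, wheelbase: float) -> float:
    """Kinematic-bicycle feedforward steering angle (rad) for a reference
    path of the given curvature (1/m).

    Simplified sketch: delta_ff = atan(L * kappa). The paper's feedforward
    is computed from a dynamic model, so this omits tire-slip effects.
    """
    return math.atan(wheelbase * curvature)

# Illustrative values: 2.9 m wheelbase, 100 m turn radius (kappa = 0.01 1/m)
delta_ff = feedforward_steer(1.0 / 100.0, 2.9)
```

On a straight segment (zero curvature) the feedforward vanishes and the RHRL feedback policy alone regulates the lateral error, which matches the feedforward/feedback decomposition stated in the abstract.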