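Affiliation: [1] School of Automotive and Transportation Engineering, Hefei University of Technology, Hefei 230009, Anhui, China
Source: China Journal of Highway and Transport (《中国公路学报》), 2024, No. 3, pp. 157-169 (13 pages)
Funding: National Natural Science Foundation of China (U22A20246, 52372382); Hefei Natural Science Foundation (2022008).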
Abstract: To address the loss of tracking accuracy and stability that intelligent vehicles suffer when driving conditions change, this study proposes a multi-objective control strategy based on a reinforcement-learning, variable-parameter model predictive control (MPC) algorithm, which adaptively tunes the parameters of the path-tracking control system. A linear time-varying MPC controller is designed from a vehicle dynamics model to obtain the optimal front-wheel steering angle and additional yaw moment. Within the Actor-Critic reinforcement learning architecture, Deep Deterministic Policy Gradient (DDPG) and Twin Delayed Deep Deterministic Policy Gradient (TD3) agents are designed to tune the control parameters, with a reward function that targets tracking accuracy and stability. Two typical simulation scenarios, a docking road and a variable-curvature road, are built to verify the algorithm. On the docking road, the controller's prediction horizon and weight matrix are adjusted promptly as the road adhesion coefficient changes; on the variable-curvature road, they are adjusted promptly as the road curvature changes. Through joint simulation with MATLAB/Simulink, CarSim, and Python, the reinforcement learning-tuned MPC is compared with a fixed-parameter MPC and a fuzzy-tuned MPC. The results show that the reinforcement learning methods are better able to raise the path-tracking accuracy of intelligent vehicles under different road conditions as far as possible while keeping the vehicle safe. On the docking road, compared with the fixed-parameter MPC and the fuzzy-tuned MPC, the reinforcement learning-tuned MPC reduces the mean lateral deviation by 99.8% and 97.6%, respectively, and the mean front-wheel steering angle rate by 99.7% and 77.0%, respectively. On the variable-curvature road, the mean lateral deviation is reduced by 79.6% and 90.8%, respectively, and the mean front-wheel steering angle rate by 40.6% and 2.6%, respectively. These results indicate that the proposed reinforcement-learning-based, variable-parameter MPC multi-objective control for intelligent vehicle path tracking can address both stability and tracking accuracy under changing driving conditions, and offers one approach to path-tracking control in complex scenarios.
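The abstract states that the agents tune the MPC prediction horizon and weight matrix, and that the reward targets tracking accuracy and stability, but it gives neither the parameter ranges nor the reward formula. The Python sketch below is only an illustration of how such a mapping and reward could look. The bounds `NP_MIN`, `NP_MAX`, `Q_MIN`, `Q_MAX`, the function names, the use of a single lateral-error weight in place of the full weight matrix, the quadratic reward form, and the weights `w_track` and `w_stab` are all assumptions, not details from the paper.

```python
import numpy as np

# Hypothetical tuning ranges (assumptions, not values from the paper).
NP_MIN, NP_MAX = 10, 40      # prediction horizon, in steps
Q_MIN, Q_MAX = 1.0, 100.0    # weight on lateral error in the MPC cost


def action_to_mpc_params(action):
    """Map a normalized actor output in [-1, 1]^2 to MPC parameters.

    Returns (prediction_horizon, lateral_error_weight).
    """
    a = np.clip(np.asarray(action, dtype=float), -1.0, 1.0)
    frac = (a + 1.0) / 2.0  # rescale each component to [0, 1]
    horizon = int(round(NP_MIN + frac[0] * (NP_MAX - NP_MIN)))
    q_lat = float(Q_MIN + frac[1] * (Q_MAX - Q_MIN))
    return horizon, q_lat


def reward(lateral_error, steer_rate, w_track=1.0, w_stab=0.1):
    """Assumed reward: penalize lateral deviation (tracking accuracy)
    and steering-angle rate (used here as a simple stability proxy)."""
    return -(w_track * lateral_error ** 2 + w_stab * steer_rate ** 2)


if __name__ == "__main__":
    # One illustrative step: the agent proposes an action, which is
    # converted to MPC settings before the controller solves its problem.
    horizon, q_lat = action_to_mpc_params([0.0, 0.0])
    print(horizon, q_lat)
    print(reward(lateral_error=0.05, steer_rate=0.2))
```

In a DDPG or TD3 setup of this kind, the actor output would typically be bounded (for example by a tanh layer), which is why the sketch clips and rescales the action before handing it to the controller.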