Research on HEV Energy Distribution Strategy Based on Improved Deep Reinforcement Learning (cited by: 1)

Authors: WU Zhong-qiang (吴忠强) [1]; MA Bo-yan (马博岩)

Affiliation: [1] Key Lab of Industrial Computer Control Engineering of Hebei Province, Yanshan University, Qinhuangdao, Hebei 066004, China

Source: Acta Metrologica Sinica (《计量学报》), 2023, No. 12, pp. 1863-1871 (9 pages)

Funding: Natural Science Foundation of Hebei Province (F2020203014).

Abstract: Taking a parallel hybrid electric vehicle (HEV) as the research object, models of the vehicle's power demand and of its powertrain are established, and an energy distribution strategy based on improved deep reinforcement learning (DRL) is proposed. The Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm in DRL is improved by introducing dual replay buffers, yielding the DRB-TD3 algorithm, which raises the sampling efficiency of the original algorithm. A rule-based constraint controller is designed and embedded in the DRL structure to eliminate unreasonable torque allocations. Simulation experiments are carried out under the UDDS driving cycle, using the performance of a dynamic programming (DP)-based energy distribution strategy as the benchmark. The results show that, compared with the Deep Deterministic Policy Gradient (DDPG) algorithm and the conventional TD3 algorithm, DRB-TD3 has the best convergence performance, improving convergence efficiency by 61.2% and 31.6%, respectively. Compared with the DDPG-based and TD3-based energy distribution strategies, the proposed strategy reduces average fuel consumption by 3.3% and 2.3%, respectively, and its fuel economy reaches 95.2% of that of the DP-based strategy, the best result among the compared strategies. The battery state of charge (SOC) is also kept at a favorable level, which helps extend battery life.
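The abstract names dual replay buffers as the change that improves TD3's sampling efficiency, but it does not say how the two buffers are filled or sampled. The following is only a minimal sketch of one plausible arrangement, not the authors' DRB-TD3 implementation: every transition goes into a regular buffer, transitions whose reward meets a threshold are also copied into a second buffer, and each training batch mixes draws from both. The class name `DualReplayBuffer`, the `reward_threshold` routing rule, and the `priority_ratio` mixing parameter are all assumptions introduced for illustration.

```python
import random
from collections import deque


class DualReplayBuffer:
    """Two FIFO replay buffers sampled together (illustrative sketch).

    Assumption: routing by a reward threshold and mixing by a fixed ratio
    are hypothetical choices; the abstract does not specify either.
    """

    def __init__(self, capacity, reward_threshold, priority_ratio=0.5, seed=None):
        self.regular = deque(maxlen=capacity)    # every transition
        self.priority = deque(maxlen=capacity)   # high-reward transitions only
        self.reward_threshold = reward_threshold
        self.priority_ratio = priority_ratio
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        transition = (state, action, reward, next_state, done)
        self.regular.append(transition)
        if reward >= self.reward_threshold:
            self.priority.append(transition)

    def sample(self, batch_size):
        # Take up to priority_ratio of the batch from the high-reward buffer,
        # then fill the remainder from the regular buffer.
        n_priority = min(int(batch_size * self.priority_ratio), len(self.priority))
        n_regular = min(batch_size - n_priority, len(self.regular))
        batch = self.rng.sample(list(self.priority), n_priority)
        batch += self.rng.sample(list(self.regular), n_regular)
        self.rng.shuffle(batch)
        return batch


if __name__ == "__main__":
    buf = DualReplayBuffer(capacity=1000, reward_threshold=0.0, seed=0)
    for step in range(50):
        reward = 1.0 if step % 5 == 0 else -1.0  # placeholder rewards
        buf.add(step, 0.0, reward, step + 1, False)
    print(len(buf.sample(16)))
```

Because high-reward transitions live in both buffers, the same transition can appear twice in one batch; a TD3 update loop would consume the returned batch in the same way as a batch from a single buffer.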

Keywords: parallel hybrid electric vehicle; energy distribution strategy; deep reinforcement learning; TD3 algorithm; state of charge

Classification: TB971 [General Industrial Technology: Metrology]
