Authors: WU Zhong-qiang (吴忠强)[1], MA Bo-yan (马博岩)
Affiliation: [1] Key Lab of Industrial Computer Control Engineering of Hebei Province, Yanshan University, Qinhuangdao, Hebei 066004, China
Source: Acta Metrologica Sinica (《计量学报》), 2023, No. 12, pp. 1863-1871 (9 pages)
Funding: Natural Science Foundation of Hebei Province (F2020203014).
Abstract: Taking a parallel hybrid electric vehicle (HEV) as the research object, models of the vehicle's demand power and of its powertrain are established, and an energy distribution strategy based on improved deep reinforcement learning (DRL) is proposed. The Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm in DRL is improved by introducing dual replay buffers, yielding the DRB-TD3 algorithm, which raises the sampling efficiency of the original algorithm. A rule-based constraint controller is designed and embedded in the DRL structure to eliminate unreasonable torque allocations. Simulation experiments are carried out under the UDDS driving cycle, with the performance of an energy distribution strategy based on dynamic programming (DP) as the benchmark. The results show that, compared with the Deep Deterministic Policy Gradient (DDPG) algorithm and the conventional TD3 algorithm, the DRB-TD3 algorithm has the best convergence performance, improving convergence efficiency by 61.2% and 31.6%, respectively. Compared with the DDPG-based and TD3-based energy distribution strategies, the proposed strategy reduces average fuel consumption by 3.3% and 2.3%, respectively. Its fuel economy reaches 95.2% of that of the DP-based strategy, the best of the three learning-based strategies, and the battery state of charge (SOC) is kept at a favorable level, which helps to extend battery life.
Keywords: parallel hybrid electric vehicle; energy distribution strategy; deep reinforcement learning; TD3 algorithm; state of charge
Classification No.: TB971 [General Industrial Technology: Metrology]