Comprehensive Value Depletion Minimization Energy Management Method for Fuel Cell Hybrid Systems Based on Proximal Policy Optimization Algorithm  Cited by: 2


Authors: LI Qi[1]; LIU Xin; MENG Xiang[1]; TAN Yi; YANG Mingze; ZHANG Shicong; CHEN Weirong[1]

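Affiliations: [1] School of Electrical Engineering, Southwest Jiaotong University, Chengdu 610031, Sichuan Province, China [2] Locomotive and Rolling Stock Research Institute of China Academy of Railway Sciences Group Co., Ltd., Haidian District, Beijing 100081, China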

Source: Proceedings of the CSEE, 2024, No. 12, pp. 4788-4798, I0015 (12 pages in total)

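Funding: National Natural Science Foundation of China (52377123); Natural Science Foundation of Sichuan Province (2022NSFSC0027); Key Project of the Scientific Research and Development Program of China State Railway Group Co., Ltd. (N2021J030).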

Abstract: To reduce the fuel cost of fuel cell hybrid power systems in city electric multiple units (EMUs) and to improve fuel cell durability, this paper proposes an energy management method based on the proximal policy optimization (PPO) algorithm. The method models the energy management problem of the hybrid system as a Markov decision process and builds the reward function around the optimization objective of minimizing the comprehensive value depletion, which accounts for both fuel economy and fuel cell durability. PPO, a deep reinforcement learning algorithm with relatively fast convergence, is then used to solve the problem and to distribute the load power reasonably and effectively between the fuel cell and the lithium battery. Finally, the method is verified experimentally using actual operating conditions of city EMUs. Under the training condition, the proposed method reduces the comprehensive value depletion by 19.71% and 5.87% compared with the energy management method based on equivalent hydrogen consumption minimization and the method based on Q-learning, respectively; under the unknown condition, the reductions are 18.05% and 13.52%, respectively. These results indicate that the proposed method effectively reduces the comprehensive value depletion and adapts well to different operating conditions.
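Illustrative sketch (not from the paper): the abstract names two ingredients, a reward built from the comprehensive value depletion and the PPO algorithm. The Python below shows one plausible shape for each. The cost terms, the price weights `h2_price` and `fc_value`, and all function names are hypothetical placeholders, since the record does not give the authors' actual reward formula. Only the clipped surrogate objective follows the standard published PPO form.

```python
import math


def comprehensive_cost(h2_kg, fc_degradation, h2_price=1.0, fc_value=1.0):
    """Hypothetical per-step cost: hydrogen cost plus fuel cell degradation cost.

    h2_price and fc_value are assumed weights, not values from the paper.
    """
    return h2_price * h2_kg + fc_value * fc_degradation


def reward(h2_kg, fc_degradation, h2_price=1.0, fc_value=1.0):
    """Reward is the negative cost, so maximizing reward minimizes depletion."""
    return -comprehensive_cost(h2_kg, fc_degradation, h2_price, fc_value)


def ppo_clip_objective(logp_new, logp_old, advantages, eps=0.2):
    """Standard PPO clipped surrogate objective, averaged over samples.

    logp_new / logp_old: log-probabilities of the taken actions under the
    new and old policies; advantages: advantage estimates for those actions.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(lp_new - lp_old)                 # probability ratio r_t
        clipped = max(1.0 - eps, min(1.0 + eps, ratio))   # clip(r_t, 1-eps, 1+eps)
        total += min(ratio * adv, clipped * adv)          # pessimistic bound
    return total / len(advantages)
```

The clipping caps how much a single update can gain from moving the policy away from the one that collected the data, which is the usual explanation for PPO's stable training behavior.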

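Keywords: fuel cell hybrid system; deep reinforcement learning; comprehensive value depletion; proximal policy optimization algorithm; energy management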

Classification: TM911 [Electrical Engineering: Power Electronics and Power Drives]

 
