Electric Vehicle Charging Navigation Strategy Based on Multi-Information Interaction and Deep Reinforcement Learning  Cited by: 19


Authors: SHEN Guohui; ZHAO Rongsheng; DONG Xiao; XING Qiang; CHEN Zhong [4]; YUAN Hao; GENG Aiguo; LIU Jimin

Affiliations: [1] NARI Group Corporation, Nanjing 211106, China; [2] Beijing Kedong Power Control System Corporation, Beijing 100194, China; [3] State Grid Electric Vehicle Service Corporation, Beijing 100053, China; [4] School of Electrical Engineering, Southeast University, Nanjing 210096, China

Source: Southern Power System Technology (《南方电网技术》), 2022, No. 1, pp. 108-116 (9 pages)

Funding: Science and Technology Project of State Grid Corporation of China Headquarters (5418-202018247A-0-0-00).

Abstract: To address the multi-information fusion characteristics of the dynamic driving behavior and random charging behavior of electric vehicles (EVs), as well as the complexity of multi-system modeling, this paper proposes an EV charging navigation strategy based on multi-information interaction and deep reinforcement learning (DRL). First, actual EV operation data collected by the 'optimization energy storage cloud platform of electric vehicle clusters' are modeled and mined; data preprocessing and visualization yield EV driving and charging information as well as urban charging station information. Second, the EV charging scheduling process is shown to satisfy the definition of a Markov decision process (MDP), and DRL is introduced to build the charging navigation model. The real-time 'vehicle-station-network' information serves as the state space of the deep Q network (DQN) algorithm, and the allocation of a charging station is the action executed by the agent. The reward functions for the en-route stage and the post-arrival stage are determined by evaluating the cost and time decision objectives of trips in different periods of the charging process. The optimal action-value function corresponding to the highest reward is then executed to recommend the optimal charging station and plan the driving route for the EV owner. Finally, multi-scenario simulation cases are designed to verify the feasibility and effectiveness of the proposed strategy.
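The decision loop summarized in the abstract (the state is the vehicle-station-network information, the action is the choice of charging station, and the reward weighs time against cost) can be illustrated with a minimal sketch. This is not the paper's model: it replaces the deep Q network with a plain list of Q-values for a single state, and the `Station` fields, the weights `w_time` and `w_cost`, and all numeric values are hypothetical placeholders chosen only for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Station:
    # All fields are hypothetical stand-ins for "station" information.
    travel_min: float     # driving time to the station, in minutes
    wait_min: float       # expected queueing time on arrival, in minutes
    price_per_kwh: float  # charging price per kWh, arbitrary currency


def reward(station, energy_kwh, w_time=1.0, w_cost=1.0):
    """Negative weighted sum of time and money spent; larger is better."""
    time_cost = station.travel_min + station.wait_min
    money_cost = station.price_per_kwh * energy_kwh
    return -(w_time * time_cost + w_cost * money_cost)


def select_station(q_values, epsilon, rng):
    """Epsilon-greedy choice over station indices."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore
    return max(range(len(q_values)), key=q_values.__getitem__)  # exploit


def q_update(q_values, action, r, next_q_max, alpha=0.1, gamma=0.9):
    """One temporal-difference update of the chosen action's value."""
    target = r + gamma * next_q_max
    q_values[action] += alpha * (target - q_values[action])
    return q_values


if __name__ == "__main__":
    stations = [Station(10, 5, 1.0), Station(20, 0, 0.5)]
    rng = random.Random(0)
    q = [0.0 for _ in stations]
    # Single-step episodes: the trip ends at the station, so next_q_max = 0.
    for _ in range(500):
        a = select_station(q, epsilon=0.2, rng=rng)
        q = q_update(q, a, reward(stations[a], energy_kwh=20), next_q_max=0.0)
    print("recommended station index:", select_station(q, 0.0, rng))
```

In a DQN, the list `q` would be replaced by a neural network that maps the full state vector to one Q-value per candidate station, and the update would be performed by gradient descent on the same temporal-difference target.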

Keywords: electric vehicle; charging navigation; path planning; information interaction; deep reinforcement learning

Classification code: TM73 [Electrical Engineering: Power Systems and Automation]

 
