Authors: SHEN Guohui (沈国辉); ZHAO Rongsheng (赵荣生); DONG Xiao (董晓); XING Qiang (邢强); CHEN Zhong (陈中) [4]; YUAN Hao (袁浩); GENG Aiguo (耿爱国); LIU Jimin (刘纪民)
Affiliations: [1] NARI Group Corporation, Nanjing 211106, China; [2] Beijing Kedong Power Control System Corporation, Beijing 100194, China; [3] State Grid Electric Vehicle Service Corporation, Beijing 100053, China; [4] School of Electrical Engineering, Southeast University, Nanjing 210096, China
Source: Southern Power System Technology (《南方电网技术》), 2022, No. 1, pp. 108-116 (9 pages)
Funding: Science and Technology Project of State Grid Corporation of China Headquarters (5418-202018247A-0-0-00).
Abstract: To address the multi-information fusion characteristics of the dynamic driving behavior and random charging behavior of electric vehicles (EVs), as well as the complexity of multi-system modeling, this paper proposes an EV charging navigation strategy based on multi-information interaction and deep reinforcement learning (DRL). First, the actual EV operation data collected by the "optimization energy storage cloud platform of electric vehicle clusters" are modeled and mined; data preprocessing and data visualization yield EV driving and charging information as well as urban charging station information. Second, the EV charging scheduling process is shown to satisfy the definition of a Markov decision process (MDP), and DRL is introduced to build a charging navigation model. The real-time "vehicle-station-network" information serves as the state space of the deep Q network (DQN) algorithm, and the allocation of a charging station is the action executed by the agent. The reward functions for the en-route phase and for arrival at the station are determined by evaluating the cost and time objectives of trips made in different periods of the charging process. Executing the optimal action-value function corresponding to the highest reward yields the recommended charging station and the planned driving route for the vehicle owner. Finally, multi-scenario simulation cases are designed to verify the feasibility and effectiveness of the proposed strategy.
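Keywords: electric vehicle; charging navigation; path planning; information interaction; deep reinforcement learning
Classification No.: TM73 [Electrical Engineering: Power Systems and Automation]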
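
The decision step summarized in the abstract (a state built from vehicle, station, and network information, an action that assigns a charging station, and a reward based on cost and time) can be illustrated with a minimal sketch. It is not the authors' model: the record gives no formulas, so the `StationOption` fields, the weights, and the reward form are hypothetical, and a tabular Q-learning update stands in for the deep Q network named in the abstract.

```python
import random
from dataclasses import dataclass


@dataclass
class StationOption:
    """Hypothetical per-station features; not taken from the paper."""
    name: str
    travel_time_h: float   # assumed driving time to the station, hours
    wait_time_h: float     # assumed queueing time at the station, hours
    charge_cost: float     # assumed charging cost, arbitrary currency units


def reward(opt, time_weight=1.0, cost_weight=1.0):
    """Assumed reward: negative weighted sum of total time and cost."""
    total_time = opt.travel_time_h + opt.wait_time_h
    return -(time_weight * total_time + cost_weight * opt.charge_cost)


def select_station(q_values, epsilon=0.0, rng=None):
    """Epsilon-greedy choice over per-station action values."""
    rng = rng or random.Random()
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))          # explore
    return max(range(len(q_values)), key=q_values.__getitem__)  # exploit


def q_update(q_values, action, r, next_q_max, alpha=0.1, gamma=0.9):
    """Standard Q-learning update; a tabular stand-in for a DQN training step."""
    td_target = r + gamma * next_q_max
    q_values[action] += alpha * (td_target - q_values[action])


# Illustrative numbers only.
options = [
    StationOption("A", travel_time_h=0.5, wait_time_h=0.2, charge_cost=3.0),
    StationOption("B", travel_time_h=0.3, wait_time_h=0.1, charge_cost=4.0),
    StationOption("C", travel_time_h=0.4, wait_time_h=0.0, charge_cost=2.5),
]
q_values = [reward(o) for o in options]   # one-step values used as Q estimates
best = select_station(q_values)           # greedy, since epsilon defaults to 0
recommended = options[best].name
```

With `epsilon=0.0` the selection is purely greedy, which matches the recommendation phase described in the abstract; a positive `epsilon` would be used only while learning the action values.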