Authors: Jiang Li-wei; Ye Yong-gang; Zhou Bo; Xiao Wen-chao; Zhou Ming-jun; Peng Ting-feng
Affiliation: Technology Center, Wuhan Bus Manufacturing Co., Ltd., Wuhan 430200, Hubei, China
Source: Internal Combustion Engine & Parts, 2024, No. 22, pp. 1-5 (5 pages)
摘 要:现有自动驾驶技术多集中于安全性,而忽视经济性。针对此问题,本文提出一种基于改进的深度确定性策略梯度(Deep Deterministic Policy Gradient,DDPG)算法的经济性自动驾驶决策方法。首先,分析汽车驾驶功耗因素,以汽车的电机效率为指标,设计经济性奖励函数,以优化汽车行驶效率;其次,通过引入专家经验指导和双经验池动态回放策略,灵活且高效利用经验池数据,提高模型收敛速度和稳定性;同时,改进在线价值网络,设计双在线价值网络,从而降低对策略价值的过高估计。最后在CARLA中搭建仿真环境对所提算法进行验证,结果表明,改进后的算法在累计奖励、收敛速度和稳定性等多方面均优于原始DDPG算法,有效提升了自动驾驶汽车的经济性和能效。The existing autonomous driving technology mostly focuses on safety,but neglects its economy.To address this issue,this article proposes an economic autonomous driving decision-making method based on an improved Deep Deterministic Policy Gradient(DDPG)algorithm.This article first analyzes the power consumption factors of car driving.Design an economic reward function based on the motor efficiency of the car to optimize its driving efficiency.Secondly,by introducing expert experience guidance and a dual experience pool dynamic replay strategy,the model can flexibly and efficiently utilize experience pool data to improve convergence speed and stability.Meanwhile,improving the online critic network and designing a dual online critic network can reduce overestimation of Q value.Finally,a simulation environment is built in CARLA to verify the proposed algorithm.The results show that the improved algorithm is superior to the original DDPG algorithm in terms of cumulative rewards,convergence speed and stability,and effectively improves the economy and energy efficiency of autonomous vehicle.