Authors: ZHANG Yunfei[1,2], WANG Honglun, ZHANG Menghua[1,2], GONG Yinan
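Affiliations: [1] School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China; [2] The Science and Technology on Aircraft Control Laboratory, Beihang University, Beijing 100191, China; [3] Hiwing Aviation General Equipment Co., Ltd., Beijing 100074, China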
Source: Journal of Northwestern Polytechnical University (《西北工业大学学报》), 2025, No. 1, pp. 128-139 (12 pages)
Abstract: To address dynamic soaring of unmanned aerial vehicles, a trajectory optimization method based on deep reinforcement learning is proposed. The method exploits both wind-gradient energy and solar energy, and introduces obstacle constraints to model complex, obstacle-filled environments. A neural network is first trained to approximate the policy with which the Gauss pseudospectral method solves for trajectories; the trained policy is then improved with the twin delayed deep deterministic policy gradient (TD3) algorithm. This greatly improves real-time inference while overcoming the difficulty that traditional optimal control algorithms have in coping with changing wind fields during dynamic soaring. The experiments first validate the method in simulation on two classic dynamic soaring modes, and then run Monte Carlo simulations that consider multiple energy sources. The results show that, within a single soaring cycle, the method gains energy close to the optimal result while reducing real-time inference decision time by 91%. In changing wind fields, the method adapts better than traditional methods.
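The policy-improvement step named in the abstract, TD3, trains its critics against a target that combines target policy smoothing with a clipped double-Q estimate. The following is a minimal sketch of that target for a single transition, not the authors' implementation: the function name `td3_target`, the scalar action, the action bounds, and the default hyperparameters (commonly used TD3 values) are all assumptions made for illustration, and the target actor and critics are passed in as plain callables.

```python
import random


def td3_target(reward, next_state, done, target_actor, target_q1, target_q2,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               action_low=-1.0, action_high=1.0, rng=random):
    """Return the TD3 critic target for one transition (scalar action assumed)."""
    # Target policy smoothing: add clipped Gaussian noise to the target action.
    noise = max(-noise_clip, min(noise_clip, rng.gauss(0.0, noise_std)))
    next_action = target_actor(next_state) + noise
    # Keep the perturbed action inside the (assumed) admissible range.
    next_action = max(action_low, min(action_high, next_action))
    # Clipped double-Q: use the smaller of the two target critic estimates.
    q_next = min(target_q1(next_state, next_action),
                 target_q2(next_state, next_action))
    # Do not bootstrap past a terminal transition.
    return reward + gamma * (0.0 if done else 1.0) * q_next
```

In a full TD3 loop both critics regress toward this value, and the actor and target networks are updated less often than the critics. A real dynamic soaring controller would use a multi-dimensional action, for which the same clipping applies element-wise.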
Classification No.: V249.1 [Aerospace Science and Technology: Aircraft Design]