Authors: Huang Xu; Liu Jiarun [1,2]; Luo Wuyi (Beijing Aerospace Automatic Control Institute, Beijing 100854, China; National Key Laboratory of Science and Technology on Aerospace Intelligent Control, Beijing 100854, China)
Affiliations: [1] Beijing Aerospace Automatic Control Institute, Beijing 100854, China; [2] National Key Laboratory of Science and Technology on Aerospace Intelligent Control, Beijing 100854, China
Source: Aerospace Control (《航天控制》), 2020, No. 4, pp. 3-8 (6 pages)
Abstract: This paper explores an off-line design scheme in which an agent trained by a deep reinforcement learning algorithm, rather than a human engineer, tunes the parameters of a launch vehicle attitude controller. First, a rocket frequency-domain analysis model covering multiple characteristic time points is established, and the design parameters are selected. The Double Deep Q Network (DDQN) algorithm is then chosen as the training algorithm; through experience replay and temporal-difference iteration, the agent learns continuously while interacting with the environment. A corresponding Markov decision process model is designed, and the agent is trained and then evaluated in forward tests. The results show that the method has reference value for launch vehicle attitude control design.
Classification: V448.1 [Aeronautical and Astronautical Science and Technology: Flight Vehicle Design]
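The abstract names the core mechanism: DDQN decouples action selection (online network) from action evaluation (target network) when forming the temporal-difference target, which reduces the overestimation bias of vanilla DQN. The sketch below illustrates only that target computation on a toy batch; it is a generic illustration with assumed array shapes, not the authors' implementation or their controller-tuning environment.

```python
import numpy as np

def ddqn_targets(q_online_next, q_target_next, rewards, gamma, dones):
    """Double DQN temporal-difference targets for a batch of transitions.

    y = r + gamma * Q_target(s', argmax_a Q_online(s', a)),
    with the bootstrap term dropped for terminal transitions.
    """
    # Online network selects the greedy next action...
    best_actions = np.argmax(q_online_next, axis=1)
    # ...and the target network evaluates that action.
    q_eval = q_target_next[np.arange(len(rewards)), best_actions]
    return rewards + gamma * (1.0 - dones) * q_eval

# Toy batch: two transitions, three discrete actions.
q_online_next = np.array([[1.0, 3.0, 2.0],
                          [0.5, 0.2, 0.9]])
q_target_next = np.array([[0.8, 2.5, 1.0],
                          [0.4, 0.1, 1.1]])
rewards = np.array([1.0, -1.0])
dones = np.array([0.0, 1.0])  # second transition is terminal

y = ddqn_targets(q_online_next, q_target_next, rewards, gamma=0.9, dones=dones)
print(y)  # [1 + 0.9*2.5, -1] = [3.25, -1.0]
```

In training, these targets would be regressed against the online network's Q-values for the actions actually taken, with transitions drawn from an experience replay buffer as the abstract describes.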