Authors: ZHANG RuiXian, YANG JiaNan, LIANG Ye, LU ShengAo, DONG YiFei, YANG BaoQing, ZHANG LiXian
Affiliation: [1] School of Astronautics, Harbin Institute of Technology, Harbin 150001, China
Source: Science China (Technological Sciences), 2024, Issue 2, pp. 423-434 (12 pages)
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 62225305 and 12072088); the Fundamental Research Funds for the Central Universities, China (Grant Nos. HIT.OCEF.2022047, HIT.BRET.2022004, and HIT.DZIJ.2023049); Grant JCKY2022603C016; the State Key Laboratory of Robotics and System (HIT); and the Heilongjiang Touyan Team.
Abstract: This paper investigates the navigation problem of autonomous vehicles based on reinforcement learning (RL) with both stability and smoothness guarantees. By introducing a data-based Lyapunov function, a stability criterion in mean cost is obtained, where the Lyapunov function has a fast-descent property. An off-policy RL algorithm is then proposed to train safe policies, in which a stricter constraint is imposed within the model-free RL framework to ensure fast convergence of policy generation, in contrast with existing RL methods that provide only a stability guarantee. In addition, by simultaneously constraining action increments and variations of the action distribution, the difference between adjacent actions is effectively reduced to ensure the smoothness of the obtained policy, rather than only enforcing similarity between the distributions of adjacent actions as is commonly done in the literature. A simulated navigation task for a differentially driven ground mobile vehicle demonstrates the superiority of the proposed algorithm in terms of fast stability and smoothness.
Keywords: autonomous vehicles; navigation; reinforcement learning; smoothness; stability
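To make the abstract's two ideas concrete, the sketch below illustrates (not from the paper) how a smoothness-regularized actor loss might combine a penalty on action increments, a penalty on the divergence between the action distributions at adjacent states, and a Lyapunov-style fast-descent penalty. The network architecture, coefficient names (beta_inc, beta_kl, lam, alpha), and the assumed Lyapunov critic `lyap` are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch, assuming a Gaussian policy and a learned nonnegative
# Lyapunov critic `lyap`; hyperparameters and shapes are hypothetical.
import torch
import torch.nn as nn
from torch.distributions import Normal, kl_divergence

class GaussianPolicy(nn.Module):
    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, hidden), nn.Tanh(),
                                 nn.Linear(hidden, 2 * act_dim))

    def forward(self, obs):
        mu, log_std = self.net(obs).chunk(2, dim=-1)
        return Normal(mu, log_std.clamp(-5, 2).exp())

def smooth_lyapunov_loss(policy, lyap, obs, prev_obs, next_obs,
                         prev_action, q_value,
                         beta_inc=0.1, beta_kl=0.1, lam=1.0, alpha=0.2):
    """Illustrative actor loss:
      - maximize the (given) critic value q_value,
      - penalize ||a_t - a_{t-1}||^2                (action-increment constraint),
      - penalize KL(pi(.|s_{t-1}) || pi(.|s_t))     (distribution-variation constraint),
      - penalize violations of a fast-descent condition
        L(s_{t+1}) - L(s_t) <= -alpha * L(s_t)      (assumed Lyapunov form).
    """
    dist_t = policy(obs)
    action = dist_t.rsample()
    # (i) discourage large jumps between consecutive sampled actions
    increment_pen = ((action - prev_action) ** 2).sum(-1).mean()
    # (ii) discourage large shifts between adjacent action distributions
    dist_prev = policy(prev_obs)
    kl_pen = kl_divergence(dist_prev, dist_t).sum(-1).mean()
    # (iii) hinge penalty on the fast-descent gap of the Lyapunov critic
    descent_gap = lyap(next_obs) - (1.0 - alpha) * lyap(obs)
    lyap_pen = torch.relu(descent_gap).mean()
    return (-q_value.mean()
            + beta_inc * increment_pen
            + beta_kl * kl_pen
            + lam * lyap_pen)
```

The point of combining (i) and (ii), echoing the abstract, is that constraining only the distributions of adjacent actions still allows individual sampled actions to jump, while the increment term alone ignores how the whole action distribution drifts between steps; penalizing both targets smoothness of the executed trajectory and of the policy itself.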