Authors: CAI Ze; HU Yaoguang [1]; WEN Jingqian [1]; ZHANG Lixiang
Affiliation: [1] Laboratory of Industrial and Intelligent System Engineering, Beijing Institute of Technology, Beijing 100081, China
Source: Computer Integrated Manufacturing Systems (《计算机集成制造系统》), 2023, No. 1, pp. 236-245 (10 pages)
Funding: National Key R&D Program of China (2021YFB1715700); National Natural Science Foundation of China (52175451)
Abstract: To improve the collision avoidance capability of Automated Guided Vehicles (AGVs) in the complex dynamic environment of smart factories, and to enable them to carry out avoidance tasks safely and efficiently under the guidance of a global path, a local collision avoidance method based on deep reinforcement learning was proposed. First, the collision avoidance problem was formulated as a Partially Observable Markov Decision Process (POMDP), in which the observation space, action space, reward function, and optimal avoidance policy were described in detail; guidance of local avoidance planning by the global path was achieved by setting different reward values. On this basis, a Deep Deterministic Policy Gradient (DDPG) algorithm was employed to train the avoidance policy. Finally, a simulation environment was built and multiple experimental scenarios were designed to validate the proposed method. Experimental results showed that the method could cope with complex dynamic environments, reduce collision avoidance time and distance, and improve operating efficiency.
Classification: TH166 [Mechanical Engineering - Machinery Manufacturing and Automation]
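The reward design described in the abstract (different reward terms steering local avoidance back toward the global path) can be illustrated with a minimal sketch. All function names, weights, and thresholds below are hypothetical assumptions for illustration only, not the paper's actual formulation:

```python
import math

def shaped_reward(pos, prev_pos, goal, path_point, collided,
                  w_goal=1.0, w_path=0.5,
                  r_collision=-10.0, r_arrival=10.0):
    """Illustrative reward shaping for AGV local collision avoidance.

    pos, prev_pos : current and previous AGV positions (x, y)
    goal          : goal position on the global path
    path_point    : nearest reference point on the global path
    collided      : whether a collision occurred this step

    Weights and terminal rewards are made-up example values.
    """
    if collided:
        return r_collision                      # terminal penalty
    if math.dist(pos, goal) < 0.1:
        return r_arrival                        # terminal bonus at the goal
    # Dense term 1: progress made toward the goal this step.
    progress = math.dist(prev_pos, goal) - math.dist(pos, goal)
    # Dense term 2: penalty for deviating from the global reference path,
    # which is what pulls local avoidance back toward global guidance.
    deviation = math.dist(pos, path_point)
    return w_goal * progress - w_path * deviation
```

In a DDPG training loop this scalar would be the per-step reward; the relative size of `w_path` against `w_goal` controls how strongly the learned policy hugs the global path while detouring around obstacles.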