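Intelligent Decision-Making for Yard Container Relocation During Loading Based on Inverse Reinforcement Learning (Cited by: 7)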

An Inverse Reinforcement Learning Method for Container Relocation in Container Terminal Yard During Loading

Authors: ZHANG Yanwei [1]; CAI Mengdie (School of Transportation and Logistics Engineering, Wuhan University of Technology, Wuhan 430063, China)

Affiliation: [1] School of Transportation and Logistics Engineering, Wuhan University of Technology, Wuhan 430063, Hubei, China

Source: Journal of Tongji University (Natural Science), 2021, No. 10, pp. 1417-1425 (9 pages)

Funding: National Natural Science Foundation of China (60904067).

Abstract: Container relocation in the terminal yard during ship loading is sequential and dynamic, and is an NP-hard problem (NP: non-deterministic polynomial). This paper studies the common container terminal yard laid out parallel to the shoreline. With minimizing the total number of relocations as the optimization objective, and considering the effect of relocation on the continuity and efficiency of loading, a model of yard container relocation during loading is built on a Markov decision process, and an inverse reinforcement learning algorithm is designed. To verify the effectiveness of the algorithm, with random decision-making as the baseline, the designed inverse reinforcement learning algorithm is compared with the rule-based decision-making commonly used at terminals and with random decision-making. The results show that when the stacking state of a bay is poor, common rule-based decision-making is not necessarily superior to random decision-making. The inverse reinforcement learning algorithm effectively mines the implicit expert experience, converges to the minimum number of relocations with a higher probability, and better limits the number of relocations for a single container retrieval under different stacking states, so it can realize intelligent decision-making for yard container relocation during loading.
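The comparison between rule-based and random relocation described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's model or its inverse reinforcement learning algorithm; it only counts relocations for one bay under two stand-in policies. The bay encoding (stacks of loading priorities), the helper names `random_rule`, `lowest_stack_rule`, and `count_relocations`, and the "lowest stack" heuristic are all assumptions made for illustration. In Markov decision process terms, the state would be the bay layout plus the next container to load, the action the destination stack, and each relocation a unit cost.

```python
import random
from typing import Callable, List

# Assumed encoding: each stack lists loading priorities bottom-to-top;
# priority 1 is loaded first, and only the top container of a stack is reachable.
Bay = List[List[int]]
Rule = Callable[[Bay, int, int, random.Random], int]


def _free_stacks(bay: Bay, src: int, max_height: int) -> List[int]:
    """Indices of stacks other than src that still have a free slot."""
    options = [i for i, s in enumerate(bay) if i != src and len(s) < max_height]
    if not options:
        raise ValueError("no free slot available for relocation")
    return options


def random_rule(bay: Bay, src: int, max_height: int, rng: random.Random) -> int:
    """Baseline: choose a destination stack uniformly at random."""
    return rng.choice(_free_stacks(bay, src, max_height))


def lowest_stack_rule(bay: Bay, src: int, max_height: int, rng: random.Random) -> int:
    """Hypothetical yard rule: choose the shortest stack (ties go to the leftmost)."""
    return min(_free_stacks(bay, src, max_height), key=lambda i: len(bay[i]))


def count_relocations(bay: Bay, max_height: int, rule: Rule, seed: int = 0) -> int:
    """Load containers in priority order and count the relocations needed."""
    rng = random.Random(seed)
    bay = [list(s) for s in bay]  # work on a copy; the caller's bay is untouched
    total = sum(len(s) for s in bay)
    relocations = 0
    for target in range(1, total + 1):
        src = next(i for i, s in enumerate(bay) if target in s)
        # Move every container stacked above the target to another stack.
        while bay[src][-1] != target:
            dst = rule(bay, src, max_height, rng)
            bay[dst].append(bay[src].pop())
            relocations += 1
        bay[src].pop()  # the target is now on top and is loaded
    return relocations


bay = [[3, 1], [2, 4], [5]]
print(count_relocations(bay, 3, lowest_stack_rule))  # prints 2
print(count_relocations(bay, 3, random_rule, seed=1))
```

For this bay, the lowest-stack rule needs 2 relocations, while placing container 4 on top of container 5 would need only 1, so a random choice can match or beat the rule. This mirrors, on a toy scale, the abstract's observation that common rules are not necessarily better than random decisions.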

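Keywords: container terminal; yard container relocation; intelligent decision-making; Markov decision process; inverse reinforcement learning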

Classification No.: U695.22 [Transportation Engineering: Port, Coastal and Offshore Engineering]

