Bidirectional Q-learning for recycling path planning of used appliances under strong and weak constraints

Authors: Yang Qi, Jinxin Cao, Baijing Wu

Affiliations: [1] Institute of Transportation Engineering, Inner Mongolia University, Hohhot, 010010, China; [2] Inner Mongolia Academy of Science and Technology, Hohhot, 010010, China; [3] Institute of Electronics and Information Engineering, Lanzhou Jiaotong University, Lanzhou, 730070, China

Source: Communications in Transportation Research (English-language journal), 2024, Issue 1, pp. 451-460 (10 pages)

Funding: National Natural Science Foundation of China (Grant Nos. 72461028, 71961024, 72161032, 72061028, and 71971022); Key Technology Research Plan of Inner Mongolia Autonomous Region (Grant No. 2019GG287).

Abstract: With continuous innovation in household appliance technology and rising living standards, the volume of discarded household appliances has increased rapidly, making their recycling increasingly important. Traditional path planning algorithms struggle to balance efficiency and constraints when addressing the multi-objective, multi-constraint problem posed by discarded household appliance recycling routes. To tackle this issue, this study introduces a bidirectional Q-learning-based path planning algorithm. By developing a bidirectional Q-learning mechanism and enhancing the initialization method of Q-learning, the algorithm aims to optimize discarded household appliance recycling routes efficiently and effectively. It implements bidirectional updates of the state-action value function from both the starting point and the target point. Additionally, a hierarchical reinforcement learning strategy and guided rewards are introduced to minimize blind exploration and expedite convergence. By decomposing complex recycling tasks into multiple sub-tasks and seeking paths with superior performance at each sub-task level, the initial exploratory blindness is reduced. To validate the efficacy of the proposed algorithm, grid-based modeling of real-world environments is used. Comparative experiments reveal significant improvements in iteration counts and path lengths, supporting its practical applicability in path planning for recycling initiatives.
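The abstract does not state the update rule, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a small grid map, standard tabular Q-learning, one Q-table trained from the starting point toward the target and a second trained from the target toward the starting point, and a distance-based guided reward. All names (`EXAMPLE_GRID`, `train`, `bidirectional_plan`), the reward values, and the hyperparameters are hypothetical choices made for illustration. The hierarchical sub-task decomposition and the improved initialization mentioned in the abstract are not modeled here.

```python
import random

ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

# Hypothetical map: 0 = free cell, 1 = obstacle.
EXAMPLE_GRID = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]


def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def step(grid, state, action, target):
    """Move one cell; moves into walls or obstacles leave the agent in place."""
    r, c = state[0] + action[0], state[1] + action[1]
    blocked = not (0 <= r < len(grid) and 0 <= c < len(grid[0])) or grid[r][c] == 1
    nxt = state if blocked else (r, c)
    if nxt == target:
        return nxt, 10.0, True
    # Guided reward (assumed form): step cost plus a bonus for moving closer.
    guide = 0.5 * (manhattan(state, target) - manhattan(nxt, target))
    return nxt, -1.0 + guide, False


def train(grid, source, target, episodes=500, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    """Tabular Q-learning from `source` toward `target`; returns the Q-table."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        state = source
        for _ in range(100):  # cap on steps per episode
            values = q.setdefault(state, [0.0] * 4)
            if rng.random() < eps:
                a = rng.randrange(4)           # explore
            else:
                a = values.index(max(values))  # exploit
            nxt, reward, done = step(grid, state, ACTIONS[a], target)
            future = 0.0 if done else max(q.setdefault(nxt, [0.0] * 4))
            values[a] += alpha * (reward + gamma * future - values[a])
            state = nxt
            if done:
                break
    return q


def greedy_path(grid, q, source, target, limit=100):
    """Follow the greedy policy; return None if the target is not reached."""
    path, state = [source], source
    while state != target and len(path) <= limit:
        values = q.get(state, [0.0] * 4)
        state, _, _ = step(grid, state, ACTIONS[values.index(max(values))], target)
        path.append(state)
    return path if state == target else None


def bidirectional_plan(grid, start, goal):
    """Learn in both directions and keep the shorter of the two greedy paths."""
    forward = greedy_path(grid, train(grid, start, goal, seed=0), start, goal)
    backward = greedy_path(grid, train(grid, goal, start, seed=1), goal, start)
    candidates = [forward] if forward else []
    if backward:
        candidates.append(backward[::-1])  # reverse so it runs start -> goal
    return min(candidates, key=len) if candidates else None


if __name__ == "__main__":
    print(bidirectional_plan(EXAMPLE_GRID, (0, 0), (4, 4)))
```

Keeping two independent Q-tables is the simplest way to show learning from both endpoints. A shared table, or agents that meet midway, would be alternative designs that this sketch does not attempt.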

Keywords: Path planning; Q-learning; Waste electrical recovery; Reinforcement learning; Reward function

Classification: TN9 [Electronics and Telecommunications: Information and Communication Engineering]
