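Authors: 张建军 (ZHANG Jianjun) [1]; 杨云丹 (YANG Yundan); 周一卓 (ZHOU Yizhuo) (School of Economics and Management, Tongji University, Shanghai 200092, China)
Source: 《上海管理科学》 (Shanghai Management Science), 2025, No. 2, pp. 109-117 (9 pages)
Funding: National Natural Science Foundation of China project (M-0310); Shanghai Soft Science Key Project (23692109300); Shanghai Social Science Planning Project (2022ZGL011).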
Abstract: After a major emergency, how relief organizations can efficiently allocate limited humanitarian aid supplies, meeting the material needs of affected areas while reducing the suffering of disaster victims, is an important research topic. To address this problem, this paper formulates a mixed-integer nonlinear programming (MINLP) model that requires solving for a multi-period dynamic optimal allocation strategy. Reinforcement learning (RL), one of the two mainstream approaches to policy exploration, adjusts its policy from feedback signals obtained by interacting with the environment and so adapts to external dynamics. It is highly scalable and better suited to dynamic supply allocation than heuristic algorithms that solve for a specific state. The paper therefore uses the Dueling DQN algorithm to solve for the optimal policy, avoiding the Q-value overestimation seen in earlier RL applications to humanitarian supply allocation and estimating the action-value function of affected areas more accurately. The paper also introduces a stochastic demand assumption, which brings the model closer to actual disaster conditions and improves its validity and realism. A numerical example set in the Ya'an earthquake verifies the algorithm's performance, and the paper presents itself as the first study to use real data sources to support RL-based optimization of emergency supply allocation. Compared with the traditional DQN method, Dueling DQN reduces total cost by approximately 5%, which means the suffering of the affected population is reduced more effectively while supply is still ensured. This reflects China's "people-oriented" rescue principle and carries theoretical and practical significance for humanitarian emergency relief.
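The abstract names Dueling DQN but this record gives no network details. As a rough illustration only, the sketch below assumes the commonly used mean-subtracted dueling aggregation, Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a'), which combines a state value and per-action advantages into action values. The function name, the three candidate actions, and all numbers are hypothetical and are not taken from the paper; treating cost as a negative value is likewise an assumption.

```python
from typing import List


def dueling_q_values(state_value: float, advantages: List[float]) -> List[float]:
    """Combine V(s) and A(s, a) into Q(s, a) using the mean-subtracted dueling form."""
    if not advantages:
        raise ValueError("advantages must contain at least one action")
    mean_advantage = sum(advantages) / len(advantages)
    # Subtracting the mean advantage keeps V and A identifiable.
    return [state_value + (a - mean_advantage) for a in advantages]


# Hypothetical example: three candidate allocation actions for one affected area.
# Values are negative because cost is assumed to enter as a negative reward.
q_values = dueling_q_values(state_value=-10.0, advantages=[1.0, 0.0, -1.0])

# Greedy choice: the action with the highest estimated value (lowest expected cost).
best_action = max(range(len(q_values)), key=q_values.__getitem__)
```

In a full agent the state value and advantages would come from two heads of a neural network; here they are fixed numbers so that the aggregation step can be read in isolation.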