Authors: Wentao Liu, Xiaolong Xu, Jintao Wu, Jielin Jiang
Affiliations: [1] School of Computer Science, Nanjing University of Information Science and Technology, Nanjing 210044, China; [2] School of Software, Nanjing University of Information Science and Technology, Nanjing 210044, China
Source: Tsinghua Science and Technology (清华大学学报(自然科学版)英文版), 2024, Issue 3, pp. 911-926 (16 pages)
Funding: Supported by the National Key Research and Development Program of China (No. 2020YFB1707601); the Major Research Plan of the National Natural Science Foundation of China (No. 92267104); the Natural Science Foundation of Jiangsu Province of China (No. BK20211284); and the Financial and Science Technology Plan Project of Xinjiang Production and Construction Corps (No. 2020DB005).
Abstract: As an emerging privacy-preserving machine learning framework, Federated Learning (FL) enables different clients to train a shared model collaboratively by exchanging and aggregating model parameters while raw data are kept local and private. When this learning framework is applied to Deep Reinforcement Learning (DRL), the resultant Federated Reinforcement Learning (FRL) can circumvent the heavy data sampling required in conventional DRL and benefit from diversified training data, in addition to the privacy preservation offered by FL. Existing FRL implementations presuppose that clients have compatible tasks which a single global model can cover. In practice, however, clients usually have incompatible (different but still similar) personalized tasks, which we call task shift. Task shift may severely hinder the implementation of FRL in practical applications. In this paper, we propose a Federated Meta Reinforcement Learning (FMRL) framework by integrating Model-Agnostic Meta-Learning (MAML) and FRL. Specifically, we utilize Proximal Policy Optimization (PPO) to fulfil multi-step local training with a single round of sampling. Moreover, considering the sensitivity of learning rate selection in FRL, we reconstruct the aggregation optimizer with the Federated version of Adam (Fed-Adam) on the server side. The experiments demonstrate that, in different environments, FMRL outperforms other FL methods, with high training efficiency brought by Fed-Adam.
Keywords: federated learning; reinforcement learning; meta-learning; personalization
Classification number: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]
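Note on server-side aggregation: the abstract states that the aggregation optimizer is rebuilt with a federated version of Adam on the server, but this record does not give the update rule. The sketch below is therefore only an illustration of one commonly used FedAdam-style server step, not the paper's implementation. It treats the averaged client update as a pseudo-gradient and applies Adam-style moment estimates to it. The function name `fed_adam_step`, the flat-list parameter representation, and all hyperparameter values (`lr`, `beta1`, `beta2`, `tau`) are assumptions made for this example.

```python
import math


def fed_adam_step(weights, client_weights, m, v,
                  lr=0.1, beta1=0.9, beta2=0.99, tau=1e-3):
    """One hypothetical FedAdam-style server update.

    weights        : current global parameters (list of floats)
    client_weights : list of locally trained parameter lists, one per client
    m, v           : server-side first and second moment estimates
    Returns the updated (weights, m, v).
    """
    n = len(client_weights)
    new_w, new_m, new_v = [], [], []
    for j, w in enumerate(weights):
        # Pseudo-gradient: mean client parameter minus the global parameter.
        delta = sum(cw[j] for cw in client_weights) / n - w
        # Adam-style exponential moving averages of the pseudo-gradient.
        mj = beta1 * m[j] + (1 - beta1) * delta
        vj = beta2 * v[j] + (1 - beta2) * delta * delta
        # Adaptive step; tau keeps the denominator away from zero.
        new_w.append(w + lr * mj / (math.sqrt(vj) + tau))
        new_m.append(mj)
        new_v.append(vj)
    return new_w, new_m, new_v


# Toy usage: two clients, two parameters, moments initialised to zero.
w, m, v = fed_adam_step([0.0, 1.0], [[1.0, 1.0], [3.0, 1.0]],
                        [0.0, 0.0], [0.0, 0.0])
```

In this toy call the first parameter moves toward the client average, while the second stays put because every client agrees with the global value. Scaling the step by the second-moment estimate is what makes such an optimizer less sensitive to the server learning rate than plain averaging, which is the motivation the abstract gives for Fed-Adam.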