检索规则说明:AND代表“并且”;OR代表“或者”;NOT代表“不包含”;(注意必须大写,运算符两边需空一格)
检 索 范 例 :范例一: (K=图书馆学 OR K=情报学) AND A=范并思 范例二:J=计算机应用与软件 AND (U=C++ OR U=Basic) NOT M=Visual
作 者:顾永跟[1] 高凌轩 吴小红[1] 陶杰[1] GU Yonggen;GAO Lingxuan;WU Xiaohong;TAO Jie(School of Information Engineering,Huzhou University,Huzhou 313000,Zhejiang,China)
机构地区:[1]湖州师范学院信息工程学院,浙江湖州313000
出 处:《计算机工程》2024年第6期188-196,共9页Computer Engineering
基 金:浙江省现代农业资源智慧管理与应用研究重点实验室项目(2020E10017)。
摘 要:联邦学习作为一种保护本地数据隐私安全的分布式机器学习方法,联合分散的设备共同训练共享模型。通常联邦学习在数据均有标签情况下进行训练,然而现实中无法保证标签数据完全存在,提出联邦半监督学习。在联邦半监督学习中,如何利用无标签数据提升系统性能和如何缓解数据异质性带来的负面影响是两大挑战。针对标签数据仅在服务器场景,基于分享的思想,设计一种可应用在联邦半监督学习系统上的方法Share&Mark,该方法将客户端的分享数据由专家标记后参与联邦训练。同时,为充分利用分享的数据,根据各客户端模型在服务器数据集上的损失值动态调整各客户端模型在联邦聚合时的占比,即ServerLoss聚合算法。综合考虑隐私牺牲、通信开销以及人工标注成本3个方面的因素,对不同分享率下的实验结果进行分析,结果表明,约3%的数据分享比例能平衡各方面因素。此时,采用Share&Mark方法的联邦半监督学习系统FedMatch在CIFAR-10和Fashion-MNIST数据集上训练的模型准确率均可提升8%以上,并具有较优的鲁棒性。Federated Learning(FL)is a distributed machine-learning method that protects the privacy and security of local data by training a shared model on decentralized devices.Typically,FL is performed when all data are labeled.However,in reality,the availability of labeled data is not always guaranteed.Therefore,Federated Semi-Supervised Learning(FSSL)has been proposed.In FSSL,there are two major challenges:utilizing unlabeled data to improve system performance and mitigating the negative effects of data heterogeneity.To address the scenario in which labeled data exist only on the server,a method called Share&Mark is designed based on the concept of sharing.This method can be applied to FSSL systems.Share&Mark involves having experts annotate the shared data from client devices,which then participate in federated training.In addition,to leverage the shared data fully,the ServerLoss aggregation algorithm dynamically adjusts the proportions of the client models during federated aggregation based on their respective loss values on the server dataset.Considering privacy sacrifices,communication costs,and manual annotation costs,the experimental results for different sharing ratios are analyzed.It is found that a sharing ratio of approximately 3%is a balanced choice considering all factors.With Share&Mark method,the FSSL system called FedMatch achieves an accuracy improvement of more than 8%on the CIFAR-10 and Fashion-MNIST datasets.It also demonstrates high robustness.
关 键 词:联邦半监督学习 联邦学习 数据非独立同分布 鲁棒性 聚合算法 数据分享
分 类 号:TP391[自动化与计算机技术—计算机应用技术]
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在链接到云南高校图书馆文献保障联盟下载...
云南高校图书馆联盟文献共享服务平台 版权所有©
您的IP:216.73.216.44