Adaptive Clustering Center Selection: A Privacy-Utility Balancing Method for Federated Learning


Authors: NING Bo [1], NING Yiming, YANG Chao, ZHOU Xin, LI Guanyu [1], MA Qian (School of Information Science and Technology, Dalian Maritime University, Dalian 116026, China; Information and Communication Branch of State Grid Liaoning Electric Power Co., Ltd., Shenyang 110000, China)

Affiliations: [1] School of Information Science and Technology, Dalian Maritime University, Dalian 116026, China; [2] Information and Communication Branch of State Grid Liaoning Electric Power Co., Ltd., Shenyang 110000, China

Source: Journal of Electronics & Information Technology, 2025, Issue 2, pp. 519-529 (11 pages)

Funding: National Natural Science Foundation of China (61976032, 62002039).

Abstract: Federated learning is a distributed machine learning approach that enables multiple devices or nodes to collaboratively train a model while keeping data local. However, because the model is trained on datasets held by different parties, sensitive data may still be leaked. To mitigate this, prior work has applied differential privacy in federated learning by adding noise to gradient data. Yet while such privacy techniques reduce the risk of leaking sensitive data, model accuracy and effectiveness are partially degraded, depending on the noise magnitude. To address this problem, this paper proposes an adaptive mechanism for selecting the number of clustering centers (DP-Fed-Adap), which dynamically changes the number of clustering centers according to the training round and the change in gradients, allowing the model to maintain the same performance level while still protecting sensitive data. Experiments show that, under the same privacy budget, DP-Fed-Adap achieves better model performance and privacy protection than the federated similarity algorithm (FedSim) and the federated averaging algorithm (FedAvg) with differential privacy added.

Objective: Differential privacy, which rests on a rigorous statistical model, is widely applied in federated learning. The common approach integrates privacy protection by perturbing parameters during local model training and global model aggregation, safeguarding user privacy while maintaining model performance. A key challenge is minimizing performance degradation while ensuring strong privacy protection. One issue arises in early-stage training, when gradient directions are highly dispersed: directly computing on and processing the initial data at this stage can reduce the accuracy of the global model.

Methods: To address this issue, this study introduces a differential privacy mechanism into federated learning to protect individual privacy while clustering gradient information from multiple data owners. During gradient clustering, the number of clustering centers is dynamically adjusted based on the training epoch, with the rate of change in clusters aligned with the model training process. In the early stages, higher noise levels are introduced to strengthen privacy protection; as the model converges, noise is gradually reduced to improve learning of the true data distribution.

Results and discussion: The first set of experiments (Fig. 3) shows that different fixed numbers of cluster centers lead to different rates of change in training accuracy during the early and late stages of the training cycle. This suggests that reducing the number of cluster centers as training progresses benefits model performance, and the segmentation function is selected on this basis. The second set of experiments (Fig. 4) indicates that, among four model performance comparisons, our method achieves the highest accuracy in the later stages of training as the number of rounds increases. This demonstrates that adjusting the number of cluster centers during training has a measurable effect: as training concludes, gradient directions tend to converge, and reducing the number of cluster centers improves accuracy.
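The mechanism outlined in the abstract can be sketched in code. The following is a minimal illustration only: the schedule `num_centers`, the decay `noise_scale`, and the aggregation routine are hypothetical stand-ins for the paper's actual segmentation function and noise calibration, which are not reproduced here.

```python
import numpy as np

def num_centers(epoch, total_epochs, k_max=8, k_min=2):
    """Piecewise schedule: many centers early (gradient directions are
    dispersed), fewer as training converges. Illustrative breakpoints."""
    if epoch < total_epochs // 3:
        return k_max
    if epoch < 2 * total_epochs // 3:
        return max(k_min, k_max // 2)
    return k_min

def noise_scale(epoch, total_epochs, sigma0=1.0, sigma_min=0.1):
    """Linearly decay the Gaussian noise multiplier, so privacy noise is
    strongest early and weakest near convergence."""
    frac = epoch / max(1, total_epochs - 1)
    return sigma_min + (sigma0 - sigma_min) * (1.0 - frac)

def kmeans(X, k, iters=20, seed=0):
    """Tiny k-means over client gradient vectors (rows of X)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels

def aggregate(grads, epoch, total_epochs, clip=1.0, seed=0):
    """Clip client gradients, cluster them with an epoch-dependent number
    of centers, perturb each cluster mean with Gaussian noise, and return
    the size-weighted average of the noisy centers as the global update."""
    rng = np.random.default_rng(seed)
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    clipped = grads * np.minimum(1.0, clip / np.maximum(norms, 1e-12))
    k = num_centers(epoch, total_epochs)
    centers, labels = kmeans(clipped, k, seed=seed)
    sigma = noise_scale(epoch, total_epochs)
    noisy = centers + rng.normal(0.0, sigma * clip, size=centers.shape)
    weights = np.bincount(labels, minlength=k) / len(grads)
    return weights @ noisy
```

Under this sketch, an early epoch uses many centers and a large noise multiplier, while a late epoch collapses to few centers with little noise, mirroring the abstract's "more noise early, less as the model converges" behavior.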

Keywords: federated learning; differential privacy protection; gradient clustering; adaptive selection

CLC number: TN919 (Electronics and Telecommunications: Communications and Information Systems); TP309.2 (Electronics and Telecommunications: Information and Communication Engineering)

 
