Federated Decision Tree Algorithm for Privacy Security (面向隐私安全的联邦决策树算法)  Cited by: 13


Authors: GUO Yan-Qing (郭艳卿)[1]; WANG Xin-Lei (王鑫磊); FU Hai-Yan (付海燕)[1]; LIU Hang (刘航)[1]; YAO Ming (姚明) (School of Information and Communication Engineering, Dalian University of Technology, Dalian, Liaoning 116024; Data Intelligence Department of InsightOne Tech Co., Ltd, Beijing 100007)

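Affiliations: [1] School of Information and Communication Engineering, Dalian University of Technology, Dalian, Liaoning 116024; [2] Data Intelligence Department, InsightOne Tech Co., Ltd (深圳市洞见智慧科技有限公司), Beijing 100007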

Source: Chinese Journal of Computers (《计算机学报》), 2021, No. 10, pp. 2090-2103 (14 pages)

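Funding: Supported by the National Natural Science Foundation of China (No. 62076052, No. U1736119) and the Fundamental Research Funds for the Central Universities (No. DUT20TD110, No. DUT20RC(3)088).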

Abstract: Qualification review based on user information is an important business in the financial field. Because of insufficient user data and privacy and security concerns, banks and similar institutions cannot train high-performance default risk assessment models and therefore cannot make accurate predictions about users. To solve the problem of joint modeling when data is not shared, this paper proposes FL-DT (Federated Learning-Decision Tree), a decision tree algorithm based on federated learning. First, a histogram-based data storage structure is constructed for communication; by reducing the number of communication rounds, it effectively improves training efficiency. Second, a garbled Bloom filter based on oblivious transfer is proposed for private set intersection, yielding a federated histogram that contains the data information of all participants, from which the federated decision tree model is built. Finally, a multi-party collaborative prediction algorithm is proposed, which improves the prediction efficiency of FL-DT. The accuracy and effectiveness of FL-DT are evaluated on four commonly used financial datasets. Experimental results show that the accuracy of FL-DT is higher than that of models built using only local data, approaches that of a model trained on centralized data, and is better than that of other federated learning methods. In addition, the training efficiency of FL-DT is also better than that of existing algorithms.

In recent years, with the vigorous development of technology and its related industries, Internet finance has increasingly shown its advantages. Qualification review based on user information has long been an important business in the financial field. In most cases, when an individual applies for a loan from a bank, the bank evaluates the applicant's actual situation with an established predictive model to decide whether to grant the loan. In this process, a high-quality default risk assessment can spare banks unnecessary losses. However, current research on how banks and other lending institutions assess borrowers' default risk still has many deficiencies. On the one hand, it is difficult to build a high-quality prediction model because user data is lacking. On the other hand, people are paying more and more attention to the privacy protection of personal data, so it is hard for banks to obtain large amounts of relevant data, and as a result they cannot build prediction models that accurately predict users' situations. To solve the problem of joint modeling when data is not shared, this paper introduces the idea of federated learning, which makes effective use of other participants' data to establish a shared predictive model without that data leaving its local site. Because decision tree algorithms are widely used in financial risk control and fraud identification, this paper proposes FL-DT (Federated Learning-Decision Tree), a decision tree algorithm based on federated learning. Federated learning is a concept put forward by Google in 2016 that enables joint modeling without data sharing. Specifically, each owner's data does not leave its local site, and the globally shared model is established jointly through parameter exchange under an encryption mechanism within the federated system (without violating data privacy protection regulations). Moreover, each participant only serves fo
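The histogram-based step described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's FL-DT protocol: it assumes, purely for illustration, that each party holds different samples of the same feature together with their labels, and that per-party histograms can be merged by adding counts in the clear. The oblivious-transfer-based garbled Bloom filter and private set intersection are omitted entirely. All names (`build_histogram`, `merge_histograms`, `best_split`, the bin edges, and the toy data) are hypothetical.

```python
from bisect import bisect_left
from collections import Counter
from typing import Dict, List, Sequence, Tuple

# (bin index, class label) -> number of samples; hypothetical structure
Histogram = Dict[Tuple[int, int], int]


def build_histogram(values: Sequence[float], labels: Sequence[int],
                    edges: Sequence[float]) -> Histogram:
    """Count (bin, label) pairs for one feature held locally by one party."""
    hist: Counter = Counter()
    for v, y in zip(values, labels):
        hist[(bisect_left(edges, v), y)] += 1  # bin = number of edges below v
    return dict(hist)


def merge_histograms(hists: Sequence[Histogram]) -> Histogram:
    """Assumption: merging is plain addition of counts (no encryption here)."""
    merged: Counter = Counter()
    for h in hists:
        merged.update(h)
    return dict(merged)


def gini(counts: List[int]) -> float:
    """Gini impurity of a list of per-class counts."""
    n = sum(counts)
    return 0.0 if n == 0 else 1.0 - sum((c / n) ** 2 for c in counts)


def best_split(hist: Histogram, n_bins: int,
               classes: Sequence[int]) -> Tuple[int, float]:
    """Return (b, score) minimising weighted Gini for the split 'bin <= b'."""
    total = sum(hist.values())
    best = (-1, float("inf"))
    for b in range(n_bins - 1):
        left = [sum(hist.get((k, c), 0) for k in range(b + 1)) for c in classes]
        right = [sum(hist.get((k, c), 0) for k in range(b + 1, n_bins))
                 for c in classes]
        score = (sum(left) * gini(left) + sum(right) * gini(right)) / total
        if score < best[1]:
            best = (b, score)
    return best


# Toy data: one numeric feature, binary default label, three bins.
edges = [30.0, 50.0]
hist_a = build_histogram([25, 35, 55, 45], [0, 0, 1, 1], edges)
hist_b = build_histogram([20, 60, 52, 40], [0, 1, 1, 0], edges)
merged = merge_histograms([hist_a, hist_b])
best = best_split(merged, n_bins=len(edges) + 1, classes=[0, 1])
print(merged, best)  # the chosen split is at bin index 1
```

Only the small per-bin counts are exchanged in this sketch, not the raw feature values, which is the intuition behind using histograms to cut communication; any privacy guarantee would have to come from the cryptographic steps the sketch leaves out.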

Keywords: federated learning; decision tree; garbled Bloom filter; privacy security; data not shared

Classification: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]

 
