Differential identifiability clustering algorithms for big data analysis  Cited by: 1


Authors: Tao SHANG, Zheng ZHAO, Xujie REN, Jianwei LIU

Affiliations: [1] School of Cyber Science and Technology, Beihang University, Beijing 100083, China [2] School of Electronic and Information Engineering, Beihang University, Beijing 100083, China

Source: Science China (Information Sciences), 2021, No. 5, pp. 45-62 (18 pages in total). Chinese journal title: 中国科学(信息科学)(英文版)

Funding: Supported by the National Key Research and Development Program of China (Grant No. 2016YFC1000307) and the National Natural Science Foundation of China (Grant Nos. 61971021, 61571024).

Abstract: Individual privacy preservation has become an important issue with the development of big data technology. The definition of ρ-differential identifiability (DI) precisely matches the legal definitions of privacy, and it gives practitioners an easy parameterization approach: they can set privacy parameters based on the privacy concept of individual identifiability. However, differential identifiability is currently applied only to some simple queries and achieved by the Laplace mechanism, which cannot meet the complex privacy preservation requirements of big data analysis. In this paper, we propose a new exponential mechanism and composition properties for differential identifiability, and then apply differential identifiability to the k-means and k-prototypes algorithms on the MapReduce framework. The DI k-means algorithm uses the usual Laplace mechanism and composition properties for numerical databases, while the DI k-prototypes algorithm uses the new exponential mechanism and composition properties for mixed databases. The experimental results show that both the DI k-means and DI k-prototypes algorithms satisfy differential identifiability.
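
Illustrative sketch (not taken from the paper): the abstract states that DI k-means relies on the Laplace mechanism, so the following shows, under stated assumptions, what one Laplace-perturbed k-means iteration might look like. The function names and the `scale` parameter are hypothetical. How the noise scale is derived from ρ is specified in the paper and is not reproduced here. Adding noise to per-cluster sums and counts is a common pattern in privacy-preserving k-means and is assumed here rather than confirmed as the authors' exact procedure. The map and reduce phases are simulated in a single process.

```python
import math
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale) by inverse-CDF sampling."""
    u = rng.random() - 0.5  # uniform on [-0.5, 0.5)
    # Clamp the log argument so that u == -0.5 cannot produce log(0).
    magnitude = -math.log(max(1.0 - 2.0 * abs(u), 1e-300))
    return scale * math.copysign(1.0, u) * magnitude


def nearest(point, centroids):
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centroids[j])),
    )


def noisy_kmeans_step(points, centroids, scale, rng):
    """One k-means iteration with Laplace noise on cluster sums and counts.

    `scale` is a hypothetical stand-in for the noise scale that a privacy
    analysis would derive from the privacy parameter.
    """
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)

    # "Map" phase: assign each point to its nearest centroid.
    for p in points:
        j = nearest(p, centroids)
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += p[d]

    # "Reduce" phase: perturb the aggregates, then recompute each centroid.
    new_centroids = []
    for j in range(len(centroids)):
        noisy_count = counts[j] + laplace_noise(scale, rng)
        if noisy_count < 1.0:
            # Too few (noisy) members: keep the previous centroid.
            new_centroids.append(list(centroids[j]))
            continue
        new_centroids.append(
            [(sums[j][d] + laplace_noise(scale, rng)) / noisy_count
             for d in range(dim)]
        )
    return new_centroids


if __name__ == "__main__":
    rng = random.Random(0)
    data = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
    start = [[0.0, 1.0], [9.0, 9.0]]
    print(noisy_kmeans_step(data, start, scale=0.5, rng=rng))
```

Setting `scale=0.0` removes the noise and reduces the step to an ordinary k-means update, which is a convenient sanity check for the sketch.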

Keywords: differential identifiability; differential privacy; k-means; k-prototypes; big data

Classification code: TP311.13 [Automation and Computer Technology: Computer Software and Theory]

 
