Private high-dimensional data publication with Markov clustering


Authors: LIU Zhuo-qun [1,2], LONG Shi-gong, ZHANG Jun-ming [1,2,3], LIU Guang-yuan

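Affiliations: [1] State Key Laboratory of Public Big Data, Guizhou University, Guiyang, Guizhou 550025, China; [2] College of Computer Science and Technology, Guizhou University, Guiyang, Guizhou 550025, China; [3] Foundation Department, Guizhou Polytechnic of Construction, Guiyang, Guizhou 551400, China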

Source: Computer Engineering and Design (《计算机工程与设计》), 2025, No. 1, pp. 117-123 (7 pages)

Funding: National Natural Science Foundation of China (62062020).

Abstract: Existing differential privacy methods for publishing high-dimensional data suffer from high computational cost, low data accuracy, and reliance on a central server that may not be trustworthy. To address these problems, a private high-dimensional data publication method based on Markov clustering, MCL-LDP, is proposed. Each user's data is protected locally on the user side. After receiving the data perturbed under local differential privacy, the central server constructs an undirected dependency graph matrix to represent the complex attribute correlations of the high-dimensional data. Markov clustering then splits the high-dimensional attribute set into multiple low-dimensional attribute clusters. The expectation-maximization (EM) algorithm is used to compute the marginal distributions of the low-dimensional attribute clusters and of the overlapping attribute clusters, from which the joint distribution of the original data is estimated. A new dataset is then synthesized by sampling and published. Experimental results show that the proposed method achieves better accuracy, fewer iterations, and higher computational efficiency when publishing high-dimensional datasets.

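Keywords: high-dimensional data; local differential privacy; Markov clustering; data publication; joint distribution estimation; attribute correlation; data synthesis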

Classification code: TP309 [Automation and Computer Technology: Computer System Architecture]
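
The abstract names Markov clustering as the step that splits the attribute set into low-dimensional clusters. As a rough illustration only, the sketch below runs the generic Markov clustering procedure (expansion followed by inflation) on a small 0/1 adjacency matrix. It is not the paper's implementation: the function names, the `inflation` value, the added self-loops, and the toy graph are all assumptions, and the record does not say how MCL-LDP builds or weights its dependency graph.

```python
# Generic Markov clustering (MCL) sketch in pure Python.
# Assumption: `adj` is a symmetric 0/1 adjacency matrix of an attribute
# dependency graph. All names and parameters here are hypothetical.

def normalize_columns(m):
    """Scale every column so it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain square-matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def markov_cluster(adj, inflation=2.0, max_iter=100, tol=1e-9):
    """Return clusters as sorted lists of node indices (overlaps possible)."""
    n = len(adj)
    # Add self-loops, a common MCL convention, then normalize columns.
    m = [[1.0 if i == j else float(adj[i][j]) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(max_iter):
        expanded = matmul(m, m)  # expansion: random walks of length 2
        inflated = normalize_columns(
            [[x ** inflation for x in row] for row in expanded]
        )  # inflation: strengthen strong flows, weaken weak ones
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Each row that still carries flow defines a cluster: its nonzero columns.
    clusters = []
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members and members not in clusters:
            clusters.append(members)
    return [sorted(c) for c in clusters]

# Toy dependency graph: two disconnected triangles of attributes.
two_triangles = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
clusters = markov_cluster(two_triangles)
print(clusters)  # [[0, 1, 2], [3, 4, 5]]
```

Because a column can keep flow in more than one row, this way of reading off clusters can return overlapping groups, which is consistent with the abstract's mention of overlapping attribute clusters. How the paper handles such overlaps is not described in this record.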

 
