Authors: ZHANG Dong-Yue (张东月); NI Wei-Wei (倪巍伟) [1,2]; ZHANG Sen (张森) [1,2]; FU Nan (付楠) [1,2]; HOU Li-He (候立贺)
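Affiliations: [1] Department of Computer Science and Engineering, Southeast University, Nanjing 211189 [2] Key Laboratory of Computer Network and Information Integration in Southeast University, Ministry of Education, Nanjing 211189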
Source: Chinese Journal of Computers (《计算机学报》), 2023, No. 2, pp. 422-435 (14 pages)
Funding: Supported by the National Natural Science Foundation of China (61772131, 62072156).
Abstract: As mobile Internet applications continue to deepen, they generate large volumes of individual data. Clustering data collected from different terminals can reveal group behavior patterns and support the further development of application services. However, these data often contain sensitive personal information, and without a trusted data collector, collecting them directly for clustering risks leaking individual privacy. In recent years, local differential privacy (LDP) has attracted sustained attention from privacy researchers because of its rigorous mathematical foundation. Most existing LDP-based clustering work uses partition-based methods, which suit only convexly distributed data and incur a large loss of clustering quality. To address these problems, we focus on grid clustering and propose an LDP-based privacy-preserving grid clustering method. First, we design a grid division evaluation index that, by adjusting the granularity of the grid division, controls both the grid density estimation error and the loss of cluster edge information, and so guides the selection of the grid structure. Then, we construct a cyclic feedback mechanism between the server and the terminals that uses data distribution information to iteratively optimize the perturbation granularity, reducing the amount of differential privacy noise injected and improving the accuracy of grid density estimation while protecting the privacy of terminal data. Finally, on the server side, we propose an adaptive grid aggregation method based on the grid structure to improve the quality of privacy-preserving clustering. Theoretical analysis and experimental results show that the proposed method protects the privacy of each terminal's individual data while clustering data of different distributions well.
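The abstract does not say which perturbation mechanism the terminals use, so the following is only a minimal sketch of the general idea of estimating grid densities under LDP, not the authors' method. It assumes k-ary (generalized) randomized response as the mechanism: each terminal maps its point to a grid cell, reports a perturbed cell index, and the server debiases the report counts. The function names `cell_index`, `grr_perturb`, and `estimate_counts` are hypothetical, and the grid evaluation index, the cyclic feedback loop, and the adaptive grid aggregation described above are not modeled.

```python
import math
import random


def cell_index(point, bounds, g):
    """Map a 2-D point to a flat cell index in a g x g grid.

    Assumes the point lies inside bounds = ((xmin, xmax), (ymin, ymax)).
    """
    (xmin, xmax), (ymin, ymax) = bounds
    col = min(int((point[0] - xmin) / (xmax - xmin) * g), g - 1)
    row = min(int((point[1] - ymin) / (ymax - ymin) * g), g - 1)
    return row * g + col


def grr_perturb(cell, k, epsilon, rng):
    """k-ary randomized response (assumed mechanism, requires k >= 2).

    Keeps the true cell with probability p = e^eps / (e^eps + k - 1),
    otherwise reports one of the other k - 1 cells uniformly at random.
    """
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p:
        return cell
    other = rng.randrange(k - 1)  # index among the k - 1 other cells
    return other if other < cell else other + 1


def estimate_counts(reports, k, epsilon):
    """Server side: unbiased estimate of how many terminals fall in each cell."""
    n = len(reports)
    e = math.exp(epsilon)
    p = e / (e + k - 1)  # probability of reporting the true cell
    q = 1 / (e + k - 1)  # probability of reporting any one other cell
    raw = [0] * k
    for r in reports:
        raw[r] += 1
    # E[raw_v] = n_v * p + (n - n_v) * q, solved for n_v
    return [(c - n * q) / (p - q) for c in raw]


if __name__ == "__main__":
    rng = random.Random(0)
    g, epsilon = 4, 1.0
    bounds = ((0.0, 1.0), (0.0, 1.0))
    points = [(rng.random(), rng.random()) for _ in range(5000)]
    reports = [grr_perturb(cell_index(pt, bounds, g), g * g, epsilon, rng)
               for pt in points]
    for cell, est in enumerate(estimate_counts(reports, g * g, epsilon)):
        print(cell, round(est, 1))
```

A finer grid (larger `g`) preserves more cluster edge detail but spreads the same reports over more cells, so each estimated density is noisier. This is the kind of tension that the grid division evaluation index in the abstract is described as managing.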
Keywords: privacy protection; local differential privacy; grid clustering; grid division evaluation index; cyclic feedback mechanism
Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]