Heuristic algorithm for mining frequent itemsets based on differential privacy  Cited by: 4


Authors: CHEN Ting-ting [1,2], LONG Shi-gong

Affiliations: [1] Guizhou Provincial Key Laboratory of Public Big Data, Guizhou University, Guiyang 550000, Guizhou, China; [2] College of Computer Science and Technology, Guizhou University, Guiyang 550000, Guizhou, China

Source: Computer Engineering and Design (《计算机工程与设计》), 2019, No. 2, pp. 412-417 (6 pages)

Funding: Open Project Fund of the Guizhou Provincial Key Laboratory of Public Big Data (2017001)

Abstract: Publishing frequent itemset mining results directly can cause serious leaks of personal privacy. To address this problem, a frequent itemset mining algorithm that satisfies differential privacy is proposed. To reduce the global sensitivity of differential privacy, a heuristic truncation algorithm truncates transactions according to two indicators, the cover score of a candidate and the distance between an itemset and a transaction, so that each truncated transaction retains as much of the frequent-item information of the original transaction as possible. Candidate itemsets are generated using a maximum support estimation strategy, which reduces the error introduced by transaction truncation and pruning. Comparative experimental analysis shows that the proposed algorithm satisfies differential privacy and that the mined frequent itemsets have good utility.
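The role of truncation in lowering sensitivity can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not define the cover score or the itemset-to-transaction distance, so the sketch assumes a fixed item `ranking` (hypothetical, taken to be public or already privatized) as a stand-in for the heuristic, and it only releases noisy supports of single items. The names `truncate`, `laplace_noise`, and `noisy_item_supports` are illustrative.

```python
import random


def truncate(transaction, ranking, max_len):
    """Keep at most max_len items, preferring items earlier in ranking.

    Assumption: `ranking` stands in for the paper's heuristic score.
    Items that do not appear in `ranking` are dropped.
    """
    present = set(transaction)
    return [item for item in ranking if item in present][:max_len]


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def noisy_item_supports(transactions, ranking, max_len, epsilon, rng):
    """Release item supports with Laplace noise after truncation.

    After truncation each transaction touches at most max_len counts,
    so adding or removing one transaction changes the count vector by
    at most max_len in L1 norm. The noise scale is max_len / epsilon.
    """
    counts = {item: 0 for item in ranking}
    for transaction in transactions:
        for item in truncate(transaction, ranking, max_len):
            counts[item] += 1
    scale = max_len / epsilon
    return {item: c + laplace_noise(scale, rng) for item, c in counts.items()}
```

A call such as `noisy_item_supports(data, ranking, max_len=2, epsilon=1.0, rng=random.Random(0))` returns one noisy support per ranked item. A smaller `max_len` means less noise but more discarded items, which is the trade-off a truncation heuristic tries to manage.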

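Keywords: differential privacy; frequent itemsets; heuristic truncation; cover score; distance between itemset and transaction; maximum support estimation strategy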

CLC number: TP301.6 [Automation and Computer Technology: Computer System Architecture]

 
