Privacy Preserving Approaches for Multiple Sensitive Attributes in Data Publishing (数据发布中面向多敏感属性的隐私保护方法)  Cited by: 59

Authors: YANG Xiaochun (杨晓春)[1], WANG Yazhe (王雅哲)[1], WANG Bin (王斌)[1], YU Ge (于戈)[1]

Affiliation: [1] School of Information Science and Engineering, Northeastern University, Shenyang 110004, China

Source: Chinese Journal of Computers (《计算机学报》), 2008, No. 4, pp. 574-587 (14 pages)

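Funding: Program for New Century Excellent Talents in University (NCET-06-0290); National Natural Science Foundation of China (60503036); Fok Ying Tung Education Foundation grant for young teachers (104027)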

Abstract: Current privacy-preserving data publishing techniques concentrate on tables with a single sensitive attribute, yet most real-world applications contain multiple sensitive attributes. Applying the existing single-sensitive-attribute techniques directly to such data can disclose a large amount of private information. This paper presents the first detailed study of publishing data with multiple sensitive attributes. Building on the idea of protecting private data through lossy join, it proposes a multidimensional bucket grouping technique called Multi-Sensitive Bucketization (MSB). To avoid the high complexity of exhaustive search, three linear-time greedy MSB algorithms are proposed: the maximal-bucket first algorithm (MBF), the maximal single-dimension-capacity first algorithm (MSDCF), and the maximal multi-dimension-capacity first algorithm (MMDCF). In addition, because published data differ in importance in practice, a weighted MSB technique is proposed. Extensive experiments on real-world datasets show that the additional information loss of the three MSB algorithms is no more than 0.04 and that their suppression ratios are all below 0.06. The weighted MSB technique achieves a publishing ratio of more than 70% for the information that the data owner defines as important.
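
The abstract names the greedy strategies but does not spell them out. As a rough illustration only, the following Python sketch shows one way a "largest bucket first" grouping could work. It assumes that records are bucketed by their tuple of sensitive values, that each published group must contain l records that differ pairwise on every sensitive attribute, and that records left over at the end are suppressed. The function name `mbf_group`, the data layout, and the tie-breaking rule are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def mbf_group(records, l):
    """Illustrative largest-bucket-first grouping (not the paper's exact algorithm).

    records: list of tuples, one sensitive value per sensitive attribute.
    l:       required group size / diversity level.
    Returns (groups, suppressed): groups is a list of lists of record indices,
    suppressed is the sorted list of indices that could not be grouped.
    """
    # Multidimensional buckets: one bucket per distinct combination of sensitive values.
    buckets = defaultdict(list)
    for idx, sens in enumerate(records):
        buckets[tuple(sens)].append(idx)

    groups = []
    while True:
        group, used = [], []  # used = bucket keys already chosen for this group
        nonempty = [k for k in buckets if buckets[k]]
        # Visit buckets from largest to smallest (stable for equal sizes).
        for key in sorted(nonempty, key=lambda k: -len(buckets[k])):
            # Accept a bucket only if it differs from every chosen bucket
            # on every sensitive attribute.
            if all(a != b for u in used for a, b in zip(key, u)):
                used.append(key)
                group.append(buckets[key][-1])
                if len(group) == l:
                    break
        if len(group) < l:
            break  # no further full group can be formed greedily
        for key in used:
            buckets[key].pop()  # commit: remove the grouped records
        groups.append(group)

    suppressed = sorted(i for b in buckets.values() for i in b)
    return groups, suppressed


if __name__ == "__main__":
    # Hypothetical toy data: (disease, occupation) as two sensitive attributes.
    data = [("flu", "A"), ("flu", "A"), ("hiv", "B"), ("hiv", "B"), ("flu", "B")]
    groups, suppressed = mbf_group(data, l=2)
    print(groups, suppressed, len(suppressed) / len(data))
```

In this sketch the suppression ratio is simply the fraction of records that end up in `suppressed`; how the paper defines and measures it may differ.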

Keywords: data publishing; data privacy; multiple sensitive attributes; lossy join; l-diversity

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
