Probabilistic k-anonymity algorithm with multi-sensitive attributes based on variable-length clustering  Cited by: 4

Authors: Tang Yinhu (唐印浒)[1], Zhong Cheng (钟诚)[1]

Affiliation: [1] School of Computer, Electronics and Information, Guangxi University, Nanning 530004, Guangxi, China

Source: Computer Engineering and Design (《计算机工程与设计》), 2014, No. 8, pp. 2660-2665 (6 pages)

Funding: Guangxi Natural Science Foundation (2011GXNSFA018152); Guangxi Graduate Education Innovation Program (YCSZ2012007)

Abstract: A cluster-seed selection model is built that accounts for both the distances between records and the weight of each record, so as to obtain better clustering results. To improve data utility, a probabilistic k-anonymity algorithm for multiple sensitive attributes is proposed based on improved variable-length clustering. A second probabilistic k-anonymity algorithm combines k-means with the improved variable-length clustering algorithm and uses multi-threaded parallelism to process large data sets more efficiently without degrading information loss or anonymization quality. Experimental results show that the proposed algorithms are efficient and that the anonymized data sets they generate retain good data utility.
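
The abstract names two inputs to seed selection (distance between records and a per-record weight) but does not give the model itself. The sketch below is only an illustration of that general idea, not the authors' algorithm: the scoring rule, the mixing parameter `alpha`, the way leftover records are absorbed, and all function names are assumptions made for this example. It groups records into clusters of at least k members, so cluster sizes vary.

```python
import math


def pick_seed(records, weights, unassigned, seeds, alpha=0.5):
    """Pick the unassigned record with the highest combined score.

    Assumed score: alpha * (distance to nearest existing seed)
                   + (1 - alpha) * (record weight).
    """
    def score(i):
        # Distance to the nearest already-chosen seed (0.0 when there is none).
        d = min((math.dist(records[i], records[s]) for s in seeds), default=0.0)
        return alpha * d + (1 - alpha) * weights[i]

    # Iterate in index order so ties are broken deterministically.
    return max(sorted(unassigned), key=score)


def cluster_at_least_k(records, weights, k, alpha=0.5):
    """Greedily build clusters with at least k records each (sizes may vary)."""
    if len(records) < k:
        raise ValueError("need at least k records to form one cluster")

    unassigned = set(range(len(records)))
    seeds, clusters = [], []

    while len(unassigned) >= k:
        s = pick_seed(records, weights, unassigned, seeds, alpha)
        seeds.append(s)
        unassigned.remove(s)
        # Attach the k - 1 records closest to the seed.
        nearest = sorted(
            unassigned, key=lambda i: (math.dist(records[i], records[s]), i)
        )[: k - 1]
        unassigned.difference_update(nearest)
        clusters.append([s] + nearest)

    # Fewer than k records remain: each joins the cluster with the closest seed,
    # which is what makes the cluster lengths variable.
    for i in sorted(unassigned):
        best = min(
            range(len(clusters)),
            key=lambda c: math.dist(records[i], records[seeds[c]]),
        )
        clusters[best].append(i)

    return clusters


# Hypothetical toy data: two tight groups plus one record in between.
records = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (5, 5)]
weights = [1.0] * len(records)
clusters = cluster_at_least_k(records, weights, k=3)
```

Each resulting cluster would then be generalized as one equivalence class; how the paper handles multiple sensitive attributes and the probabilistic k-anonymity condition is not described in the abstract and is not modeled here.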

Keywords: data publishing; multiple sensitive attributes; probabilistic k-anonymity; variable-length clustering; parallel threads

Classification: TP309.2 [Automation and Computer Technology: Computer System Architecture]

 
