Affiliation: [1] School of Computer and Electronic Information, Guangxi University, Nanning, Guangxi 530004, China
Source: Computer Engineering and Design (《计算机工程与设计》), 2014, No. 8, pp. 2660-2665 (6 pages)
Funding: Guangxi Natural Science Foundation (2011GXNSFA018152); Guangxi Graduate Education Innovation Program (YCSZ2012007)
Abstract: To obtain better clustering results, a cluster-seed selection model is built that accounts for both the distances between records and the weight of each record. To improve data utility, a probabilistic k-anonymity algorithm for multiple sensitive attributes is proposed, based on improved variable-length clustering. In addition, a probabilistic k-anonymity algorithm that combines k-means with the improved variable-length clustering algorithm is presented; it uses multi-threaded parallelism to process large data sets more efficiently without worsening information loss or anonymization quality. Experimental results show that the proposed algorithms are efficient and that the anonymized data sets they generate have good data utility.
Keywords: data publishing; multiple sensitive attributes; probabilistic k-anonymity; variable-length clustering; parallel threads
Classification: TP309.2 [Automation and Computer Technology: Computer System Architecture]