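Affiliation: [1] School of Information Science and Engineering, Yanshan University, Qinhuangdao, Hebei 066004, China
Source: Journal of Yanshan University (《燕山大学学报》), 2012, No. 1, pp. 32-38 (7 pages)
Funding: Natural Science Foundation of Hebei Province (Grant No. F2011203219)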
Abstract: The l-diversity model traditionally generalizes data along concept hierarchies, a strategy that often causes unnecessary information loss when sensitive attributes are anonymized. To address this problem, clustering is introduced into data anonymization and a clustering-based l-diversity anonymization method is proposed. Subject to the constraints of the l-diversity model, the method partitions tuples with a distance-based hierarchical clustering algorithm, applies different generalization strategies to different types of quasi-identifiers, and describes the information loss caused by generalization in terms of the change in the uncertainty of attribute values before and after generalization. Compared with the existing l-diversity model, the method protects users' sensitive attributes well and reduces, to some degree, the information loss caused by generalization.
Keywords: anonymity protection; data generalization; information loss; clustering; l-diversity
Classification number: TP311 [Automation and Computer Technology: Computer Software and Theory]
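The idea summarized in the abstract, grouping tuples by distance until every group satisfies l-diversity and then scoring the cost of generalization, can be illustrated with a minimal sketch. This is not the paper's algorithm, which the record does not give. It assumes a single numeric quasi-identifier (a hypothetical `age`), the simple "distinct" variant of l-diversity (at least l different sensitive values per group), centroid distance for merging, and a range-based loss used here as a stand-in for the paper's uncertainty-based measure. All function names are illustrative.

```python
def distinct_sensitive(cluster):
    """Number of distinct sensitive values among (age, sensitive) tuples."""
    return len({s for _, s in cluster})


def centroid(cluster):
    """Mean of the numeric quasi-identifier (assumed: age)."""
    return sum(a for a, _ in cluster) / len(cluster)


def cluster_l_diverse(records, l):
    """Agglomerative sketch: merge each non-l-diverse cluster into its
    nearest neighbour until every cluster has >= l distinct sensitive values."""
    if distinct_sensitive(records) < l:
        raise ValueError("data set has fewer than l distinct sensitive values")
    clusters = [[r] for r in records]
    while True:
        bad = [c for c in clusters if distinct_sensitive(c) < l]
        if not bad:
            return clusters
        c = bad[0]
        others = [o for o in clusters if o is not c]
        nearest = min(others, key=lambda o: abs(centroid(o) - centroid(c)))
        clusters = [o for o in others if o is not nearest] + [c + nearest]


def range_loss(clusters):
    """Assumed loss measure: each tuple's age is generalized to its cluster's
    [min, max] interval; loss is the interval width over the domain width,
    averaged over all tuples (0 = no loss, 1 = fully generalized)."""
    ages = [a for c in clusters for a, _ in c]
    span = max(ages) - min(ages)
    if span == 0:
        return 0.0
    total = sum(
        len(c) * (max(a for a, _ in c) - min(a for a, _ in c)) / span
        for c in clusters
    )
    return total / len(ages)


if __name__ == "__main__":
    records = [(25, "flu"), (27, "cancer"), (26, "flu"),
               (50, "hiv"), (52, "flu"), (51, "cancer")]
    groups = cluster_l_diverse(records, l=2)
    for g in groups:
        print(sorted(g))
    print(round(range_loss(groups), 4))
```

Because nearby tuples are merged first, the generalized intervals stay narrow, which is the intuition behind the abstract's claim that clustering can lose less information than a fixed concept hierarchy.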