Authors: Zhou Qianyi, Wang Yamin[1], Wang Chuang (School of Economics and Management, Xidian University, Xi'an 710126, China)
Affiliation: [1] School of Economics and Management, Xidian University, Xi'an 710126
Source: Data Analysis and Knowledge Discovery (《数据分析与知识发现》), 2018, No. 2, pp. 58-63 (6 pages)
Abstract: [Objective] Building on existing data masking techniques, this paper improves the partitioning of anonymous groups to obtain a better masking model and algorithm. [Methods] Starting from k-anonymity, we revise the criterion for choosing the partitioning dimension and use a KD tree as the storage structure to construct a new algorithm. We implemented the algorithm in Python and compared the number of anonymous groups produced and the NCP percentage to verify its feasibility and effectiveness. [Results] The new algorithm maximizes the number of anonymous groups generated over the masked dataset, and its NCP percentage is lower than that of comparable algorithms. [Limitations] For datasets in which one attribute is markedly more dispersed than the others, the iterative computation of the partitioning dimension is cumbersome. [Conclusions] Compared with traditional algorithms, the new algorithm yields more anonymous groups; compared with similar algorithms, it incurs lower information loss.
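Classification: TP391 [Automation and Computer Technology: Computer Application Technology]; G35 [Automation and Computer Technology: Computer Science and Technology]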
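The abstract names the building blocks (k-anonymity, KD-tree partitioning, and an NCP percentage) without giving the revised dimension-selection criterion, so the authors' algorithm cannot be reproduced from this record. As a rough illustration only, the sketch below shows a generic median-split, KD-tree-style partitioning of numeric records into groups of at least k, plus one common way to compute an NCP percentage. It assumes NCP stands for the normalized certainty penalty (the abstract does not expand the acronym), assumes the widest-range attribute is tried first, and assumes normalization by the number of attributes times the number of records. The names `partition` and `ncp_percent` are hypothetical.

```python
from typing import List, Sequence

Record = Sequence[float]


def _span(records: Sequence[Record], dim: int) -> float:
    """Range (max - min) of one attribute over a set of records."""
    values = [r[dim] for r in records]
    return max(values) - min(values)


def partition(records: Sequence[Record], k: int) -> List[List[Record]]:
    """Split records KD-tree style so that every leaf group has >= k records.

    Assumption: dimensions are tried from widest to narrowest range and the
    split point is the median. This is a generic heuristic, not the paper's
    revised criterion.
    """
    if len(records) < 2 * k:
        return [list(records)]  # cannot split into two groups of size >= k
    dims = sorted(range(len(records[0])),
                  key=lambda d: _span(records, d), reverse=True)
    for d in dims:
        ordered = sorted(records, key=lambda r: r[d])
        median = ordered[len(ordered) // 2][d]
        left = [r for r in ordered if r[d] < median]
        right = [r for r in ordered if r[d] >= median]
        if len(left) >= k and len(right) >= k:
            return partition(left, k) + partition(right, k)
    return [list(records)]  # no dimension allows a valid split


def ncp_percent(groups: Sequence[Sequence[Record]],
                records: Sequence[Record]) -> float:
    """Size-weighted normalized certainty penalty, as a percentage.

    Each group's penalty is the sum over attributes of
    (group range / dataset range); constant attributes are skipped.
    """
    n_dims = len(records[0])
    spans = [_span(records, d) for d in range(n_dims)]
    total = 0.0
    for group in groups:
        penalty = sum(_span(group, d) / spans[d]
                      for d in range(n_dims) if spans[d] > 0)
        total += len(group) * penalty
    return 100.0 * total / (n_dims * len(records))
```

Calling `partition(data, k)` on a list of numeric tuples returns the anonymous groups, and `ncp_percent(groups, data)` then scores the information loss of that grouping; more groups of smaller extent give a lower percentage.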