A Mechanism for Measuring the Privacy Protection Effect of Anonymized Datasets  Cited by: 1

Measurement the effect of anonymization techniques over databases


Authors: ZANG Shuai; ZHU Youwen (School of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China)

Affiliation: [1] School of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016

Source: Journal of Cybersecurity (《网络空间安全科学学报》), 2024, No. 3, pp. 67-78 (12 pages)

Funding: National Key Research and Development Program of China (2021YFB3100400).

Abstract: Data owners often need to hand the data they have collected to other organizations for analysis, or release it to the public. To prevent leakage of users' private information, data is typically anonymized before being published or shared, and is considered safe to release only once it reaches a certain level of privacy protection. Measuring the privacy protection level of published data is therefore an important research topic. Previous studies lack a sufficiently general scheme for measuring this level accurately. This paper proposes a method for measuring the privacy protection level of published data. The method uses conditional entropy and mutual information to quantify the difference between the data before and after processing, then combines mutual information and joint entropy to obtain the privacy protection effect, and finally outputs a value between 0 and 1 that represents the privacy protection level of the published data. The method is applied to real datasets: after the datasets are anonymized to satisfy commonly used privacy models, the privacy protection level of each attribute is measured under each model, demonstrating the effectiveness of the proposed method.

Keywords: anonymization; privacy; information entropy; conditional entropy; mutual information

Classification: TP309 [Automation and Computer Technology: Computer System Architecture]
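
The abstract names conditional entropy and mutual information as the building blocks of the measurement but does not give the final formula. The sketch below only illustrates those two quantities for a single attribute, comparing its original values with its anonymized values. The function names, the toy data, and in particular the normalized score H(X|Y)/H(X) are assumptions made for illustration; they are not the paper's fusion of mutual information and joint entropy.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy H(X) in bits of a sequence of hashable values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def conditional_entropy(original, published):
    """H(X | Y) = H(X, Y) - H(Y), with X the original and Y the published column."""
    return entropy(list(zip(original, published))) - entropy(published)


def mutual_information(original, published):
    """I(X; Y) = H(X) - H(X | Y): what the published column reveals about X."""
    return entropy(original) - conditional_entropy(original, published)


def protection_level(original, published):
    """Illustrative score in [0, 1]: the share of H(X) that stays hidden.

    Assumption: H(X | Y) / H(X) is a stand-in, not the paper's formula.
    """
    h_x = entropy(original)
    if h_x == 0:
        return 0.0  # convention chosen here: a constant column has nothing to hide
    score = conditional_entropy(original, published) / h_x
    return min(1.0, max(0.0, score))  # clamp floating-point noise


# Hypothetical toy attribute: eight distinct ages.
ages = [23, 27, 31, 36, 42, 48, 55, 61]
generalized = ["20-39"] * 4 + ["40-69"] * 4  # generalization into two ranges
suppressed = ["*"] * len(ages)               # full suppression

for name, column in [("identity", ages),
                     ("generalized", generalized),
                     ("suppressed", suppressed)]:
    print(name, round(protection_level(ages, column), 3))
```

Under this stand-in score, publishing the column unchanged scores 0, full suppression scores 1, and generalization falls in between, which matches the 0 to 1 range described in the abstract.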

 
