Authors: Niu Xinzheng[1], Wang Chongyi, Ye Zhijia[1], She Kun[2]
Affiliations: [1] School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731; [2] School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu 610054
Source: Journal of Computer Research and Development (《计算机研究与发展》), 2017, No. 12, pp. 2785-2796 (12 pages)
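Funding: National Natural Science Foundation of China (61300192); National Key Technology R&D Program of China (2013BAH33F02); Fundamental Research Funds for the Central Universities (ZYGX2014J052); Sichuan Province Science and Technology Support Program (2015GZ0096); Soft Science Research Project of the Chengdu Science and Technology Bureau (2015-RK00-00046-ZF); Research Project of the Sichuan Provincial Public Security Department (2015SCYYCX06); Project of the Zigong Public Security Bureau, Sichuan Province~~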
Abstract: Association rule hiding is an important method in privacy-preserving data mining (PPDM). Current association rule hiding algorithms operate directly on the transaction data, which incurs substantial I/O overhead. To address this, we propose FP-DSRRC, a fast association rule hiding algorithm based on the FP-tree. The algorithm first improves the FP-tree structure by adding a transaction-number index and a bidirectional traversal structure, and then uses the improved FP-tree to process transaction information quickly, avoiding the I/O time that traversing the original data set would cost. It then builds and maintains a transaction index table to locate sensitive items quickly, and processes association rules with a clustering strategy, eliminating sensitive rules one cluster at a time. It also adopts threshold intervals for rule support and confidence, which reduces the impact of rule hiding on the original data set. Experiments show that, compared with traditional association rule hiding algorithms, FP-DSRRC reduces execution time by 50% to 70% while preserving the quality of the generated data set, and that it remains usable on large-scale real data sets.
Keywords: privacy preserving; association rule hiding; frequent pattern tree (FP-tree); sensitive rules; data sanitization
Classification number: TP301.6 [Automation and Computer Technology: Computer System Architecture]
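The abstract describes an FP-tree extended with a transaction-number index and bidirectional links, plus an index table for locating sensitive items. The following is a minimal Python sketch of those data structures only, not the authors' FP-DSRRC implementation. All names (`Node`, `build_tree`, `path_to_root`, `support`, `confidence`) and the toy data are assumptions made for illustration. The minimum-support filtering of a standard FP-tree, the clustering of sensitive rules, and the hiding step itself are omitted.

```python
from collections import defaultdict


class Node:
    """FP-tree node with links in both directions and a transaction-ID list."""

    def __init__(self, item, parent=None):
        self.item = item
        self.parent = parent   # upward link (child -> parent)
        self.children = {}     # downward links (item -> child node)
        self.count = 0         # number of transactions passing through this node
        self.tids = []         # IDs of transactions whose path ends here


def build_tree(transactions):
    """Build the tree and an item -> transaction-ID index table.

    `transactions` maps a transaction ID to a list of items (hypothetical layout).
    The data is scanned twice: once to count items, once to insert paths.
    """
    freq = defaultdict(int)
    for items in transactions.values():
        for item in set(items):
            freq[item] += 1

    root = Node(None)
    index = defaultdict(set)   # item -> IDs of transactions containing it
    for tid, items in transactions.items():
        node = root
        # Usual FP-tree ordering: descending frequency, ties broken by name.
        for item in sorted(set(items), key=lambda i: (-freq[i], i)):
            index[item].add(tid)
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
            child.count += 1
            node = child
        node.tids.append(tid)
    return root, index


def path_to_root(node):
    """Follow parent links upward and return the item path from the root."""
    path = []
    while node.parent is not None:
        path.append(node.item)
        node = node.parent
    return path[::-1]


def support(index, itemset, n):
    """Fraction of the n transactions that contain every item in itemset."""
    tids = set.intersection(*(index[i] for i in itemset))
    return len(tids) / n


def confidence(index, lhs, rhs, n):
    """Confidence of the rule lhs -> rhs, computed from the index table."""
    return support(index, lhs | rhs, n) / support(index, lhs, n)


# Toy data set, invented for this sketch.
transactions = {1: ["a", "b", "c"], 2: ["a", "b"], 3: ["a", "c"], 4: ["b", "c"]}
root, index = build_tree(transactions)
leaf = root.children["a"].children["b"].children["c"]
print(path_to_root(leaf), leaf.tids)
print(support(index, {"a", "b"}, len(transactions)))
print(confidence(index, {"a"}, {"b"}, len(transactions)))
```

Once the tree and the index table are built, the support and confidence of a rule can be read from the index without rescanning the original transactions, which is the kind of I/O saving the abstract refers to. A hiding step would then modify selected transactions until a sensitive rule falls below the chosen thresholds.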