Ranked Fuzzy Keyword Search Based on Simhash over Encrypted Cloud Data  Cited by: 28


Authors: Yang Yang[1,2], Yang Shulue, Ke Min

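Affiliations: [1] College of Mathematics and Computer Science, Fuzhou University, Fuzhou 350108; [2] Key Laboratory of Information Security of Network Systems (Fujian Province University), Fuzhou 350108; [3] College of Physics and Information Engineering, Fuzhou University, Fuzhou 350108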

Source: Chinese Journal of Computers (《计算机学报》), 2017, No. 2, pp. 431-444 (14 pages)

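Funding: National Natural Science Foundation of China (61402112, 61472307, 61472309, 61303198); Science and Technology Project of the Fujian Provincial Department of Education (JA12028); Major Science and Technology Project of Fujian Province (2015H6013); Science and Technology Development Fund of Fuzhou University (2012-XY-17)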

Abstract: With the development of cloud computing, data owners are motivated to outsource their data and the associated management tasks to the public cloud for convenience and cost savings. To protect data privacy, owners outsource sensitive data in encrypted form, which makes traditional plaintext search techniques hard to use. Searchable encryption allows encrypted data to be searched without decryption and so enables efficient data utilization. Existing work on secure search over encrypted cloud data addresses both privacy and practicality, but most of it relies on exact keyword matching. Current fuzzy keyword search schemes over encrypted cloud data mainly work by constructing a fuzzy keyword set, which incurs large computation and storage overheads. This paper proposes a scheme that does not construct a fuzzy keyword set. Building on the dimension-reduction idea of Simhash, each document keyword is split into n-grams and mapped to a Simhash fingerprint to support fuzzy matching. A two-factor ranking algorithm that combines Hamming distance with the keyword relevance score is designed to order the query results. A tree index structure and a novel traversal method further improve search efficiency: with this method, the tree can be traversed even when a node's value is not equal to the expected value. Theoretical analysis and experimental results show that the scheme achieves ranked fuzzy keyword search over encrypted cloud data while greatly reducing computation and storage overheads. © 2017, Science Press. All rights reserved.
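The fingerprinting and ranking ideas in the abstract can be illustrated with a minimal plaintext sketch in Python. It is not the authors' construction: the abstract does not give the n-gram size, the fingerprint length, the hash function, the encryption layer, the tree index, or the exact way the two ranking factors are combined. The function names (`ngrams`, `simhash`, `hamming`, `rank`), the use of bigrams, the 64-bit fingerprint, SHA-256 as the per-gram hash, and the "distance first, then relevance score" ordering are all assumptions made for illustration.

```python
import hashlib


def ngrams(word, n=2):
    """Split a keyword into overlapping character n-grams (n=2 is an assumption)."""
    word = word.lower()
    if len(word) < n:
        return [word]
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def simhash(word, n=2, bits=64):
    """Compute a Simhash fingerprint of a keyword from its n-grams.

    Each n-gram is hashed to `bits` bits; every bit position votes +1 or -1,
    and the fingerprint keeps a 1 wherever the vote total is positive.
    """
    votes = [0] * bits
    for gram in ngrams(word, n):
        digest = hashlib.sha256(gram.encode("utf-8")).digest()
        h = int.from_bytes(digest[:bits // 8], "big")
        for i in range(bits):
            votes[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, total in enumerate(votes):
        if total > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming(a, b):
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def rank(query, index, max_distance=16):
    """Rank index entries for a query keyword.

    `index` is a list of (keyword, doc_id, relevance_score) tuples.
    Entries within `max_distance` of the query fingerprint are ordered by
    Hamming distance (ascending), then relevance score (descending).
    This ordering is an assumed stand-in for the paper's two-factor ranking.
    """
    q = simhash(query)
    hits = []
    for keyword, doc_id, score in index:
        d = hamming(q, simhash(keyword))
        if d <= max_distance:
            hits.append((d, -score, doc_id, keyword))
    hits.sort()
    return [(doc_id, keyword, d, -neg_score) for d, neg_score, doc_id, keyword in hits]
```

In this sketch a misspelled query such as `rank("netwrok", index)` shares several bigrams with "network", so its fingerprint tends to lie closer in Hamming distance than that of an unrelated keyword; the threshold `max_distance` is a hypothetical tuning parameter. A real deployment along the lines of the abstract would operate on protected fingerprints and a tree index rather than a plaintext list.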

Keywords: cloud computing; encrypted cloud data; privacy protection; searchable encryption; ranked fuzzy search; Simhash

Classification: TP309 [Automation and Computer Technology: Computer System Architecture]

 
