P2P环境下面向不确定数据的Top-k查询  被引量:6

Answering Probabilistic Top-k Queries over P2P Networks

在线阅读下载全文

作  者:孙永佼[1,2] 袁野[1,2] 王国仁[1,2] 

机构地区:[1]医学影像计算教育部重点实验室(东北大学),沈阳110819 [2]东北大学信息科学与工程学院,沈阳110819

出  处:《计算机学报》2011年第11期2155-2164,共10页Chinese Journal of Computers

基  金:国家自然科学基金-国家杰出青年科学基金(61025007);国家自然科学基金重点项目(60933001)资助~~

摘  要:分布式环境中的top-k查询已经有了广泛的研究.由于仪器不精确和网络延时等原因,大多数分布式数据都存在不确定性.文中基于水平分布在P2P网络中的不确定数据提出了一个有效的top-k查询处理方法.首先利用Quad-tree构建一个分布式的不确定数据的索引,并基于索引提出了一个空间剪枝算法.然后,根据局部top-k概率与全局top-k概率之间的关系提出不确定数据成为top-k概率的上界,根据top-k概率与skyline概率之间的关系提出不确定数据成为top-k概率的下界,通过两种概率剪枝算法来减少top-k查询在网络中的传输和计算代价,并且进一步减少候选集大小.最后文中采用采样的方法来计算候选集的top-k概率以确定最终的top-k查询结果.大量的实验验证了算法的有效性.Top-k queries in distributed databases have been studied widely in recent years.There exists an inherent uncertainty on the data objects due to imprecise measurements and network delays.In this paper,based on horizontally distributed data among peers,we propose an efficient approach of processing uncertain top-k queries in P2P networks.Firstly,we construct a distributed index using Quad-tree,and based on the index,propose a spatial pruning algorithm.Secondly,we propose the upper bound of top-k probabilistic according to the relationship between local top-k probabilities and global top-k probabilities.We also propose the lower bound of top-k probabilities according to the relationship between skyline probabilities and top-k probabilities.Using the two probabilistic pruning algorithms,we can further reduce computation costs and network overhead of top-k queries,and further reduce the number of candidate sets.Finally,we develop a sampling algorithm to estimate top-k probabilities of candidates.Extensive experiments are conducted to verify the effectiveness and efficiency of the proposed methods.

关 键 词:TOP-K查询 skyline概率 P2P Quad-tree 不确定数据 

分 类 号:TP311[自动化与计算机技术—计算机软件与理论]

 

参考文献:

正在载入数据...

 

二级参考文献:

正在载入数据...

 

耦合文献:

正在载入数据...

 

引证文献:

正在载入数据...

 

二级引证文献:

正在载入数据...

 

同被引文献:

正在载入数据...

 

相关期刊文献:

正在载入数据...

相关的主题
相关的作者对象
相关的机构对象