检索规则说明:AND代表“并且”;OR代表“或者”;NOT代表“不包含”;(注意必须大写,运算符两边需空一格)
检 索 范 例 :范例一: (K=图书馆学 OR K=情报学) AND A=范并思 范例二:J=计算机应用与软件 AND (U=C++ OR U=Basic) NOT M=Visual
机构地区:[1]医学影像计算教育部重点实验室(东北大学),沈阳110819 [2]东北大学信息科学与工程学院,沈阳110819
出 处:《计算机学报》2011年第11期2155-2164,共10页Chinese Journal of Computers
基 金:国家自然科学基金-国家杰出青年科学基金(61025007);国家自然科学基金重点项目(60933001)资助~~
摘 要:分布式环境中的top-k查询已经有了广泛的研究.由于仪器不精确和网络延时等原因,大多数分布式数据都存在不确定性.文中基于水平分布在P2P网络中的不确定数据提出了一个有效的top-k查询处理方法.首先利用Quad-tree构建一个分布式的不确定数据的索引,并基于索引提出了一个空间剪枝算法.然后,根据局部top-k概率与全局top-k概率之间的关系提出不确定数据成为top-k概率的上界,根据top-k概率与skyline概率之间的关系提出不确定数据成为top-k概率的下界,通过两种概率剪枝算法来减少top-k查询在网络中的传输和计算代价,并且进一步减少候选集大小.最后文中采用采样的方法来计算候选集的top-k概率以确定最终的top-k查询结果.大量的实验验证了算法的有效性.Top-k queries in distributed databases have been studied widely in recent years.There exists an inherent uncertainty on the data objects due to imprecise measurements and network delays.In this paper,based on horizontally distributed data among peers,we propose an efficient approach of processing uncertain top-k queries in P2P networks.Firstly,we construct a distributed index using Quad-tree,and based on the index,propose a spatial pruning algorithm.Secondly,we propose the upper bound of top-k probabilistic according to the relationship between local top-k probabilities and global top-k probabilities.We also propose the lower bound of top-k probabilities according to the relationship between skyline probabilities and top-k probabilities.Using the two probabilistic pruning algorithms,we can further reduce computation costs and network overhead of top-k queries,and further reduce the number of candidate sets.Finally,we develop a sampling algorithm to estimate top-k probabilities of candidates.Extensive experiments are conducted to verify the effectiveness and efficiency of the proposed methods.
关 键 词:TOP-K查询 skyline概率 P2P Quad-tree 不确定数据
分 类 号:TP311[自动化与计算机技术—计算机软件与理论]
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在链接到云南高校图书馆文献保障联盟下载...
云南高校图书馆联盟文献共享服务平台 版权所有©
您的IP:18.118.151.13