不确定数据上两种查询的分布式聚集算法  被引量:11

Distributed Aggregations for Two Queries over Uncertain Data

在线阅读下载全文

作  者:周逊[1] 李建中[1] 石胜飞[1] 

机构地区:[1]哈尔滨工业大学计算机科学与技术学院,哈尔滨150001

出  处:《计算机研究与发展》2010年第5期762-771,共10页Journal of Computer Research and Development

基  金:国家自然科学基金项目(60773068;60703012;60773063);国家"九七三"重点基础研究发展计划基金项目(2006CB303000);国家自然科学基金重点项目(60533110);黑龙江省青年科技专项基金项目(QC06C033);NSFC/RGC联合科研基金项目(60831160525)~~

摘  要:不确定数据查询技术在军事、金融、电信等领域中起到了越来越重要的作用.不确定性数据在传感器网络、分布式Web Server及P2P系统等分布式系统中广泛存在.从这些系统中收集所有数据进行集中式查询将带来巨大的通信开销、时间延迟和存储代价.同时,由于不确定数据的特点,大多数集中式不确定查询算法在分布式环境下并不适用.给出不确定数据的最大值和Top-k聚集查询定义,并分别提出了基于过滤策略的分布式聚集算法.算法根据给出的3个过滤策略,利用数据的分布区间和概率进行筛选概率上限的计算,尽可能将不影响查询结果的数据抛弃.同时,算法以相对较小的代价归并保存并传输了计算最终查询结果所需要的"不可丢弃"数据.实验结果表明,在各类系统和数据条件下,过滤算法都能够正确地得到查询结果并显著降低系统的数据通信开销.The technique of querying uncertain data is playing an increasingly important role in the fields of military affairs,finance,and telecom.Actually,a lot of uncertain data is generated in distributed systems such as wireless sensor networks,distributed Web servers and P2P systems.Collecting all the uncertain data from such a system to perform centralized queries will lead to immense communication cost,time delay and storage cost.Also,most centralized query methods are not applicable to distributed systems due to the features of uncertain data.In this paper,distributed uncertain Max (Min) and Top-k queries are defined,and designed distributed aggregation algorithms are designed based on several filtering strategies.The algorithms compute the upper bound of the candidate probability for each data entry using the uncertain ranges and probability distributions of the data,and filter the data entries as much as possible,which have no chance to influence the query result.Also,the algorithms merge and keep a relatively small amount of necessary data in the transmission,to compute the final results.The correctness of the filtering strategies has been proved.Simulations show that the algorithms can process the queries correctly and reduce communication cost efficiently with various data and system circumstances.

关 键 词:不确定数据 分布式聚集 TOP-K查询 过滤策略 传感器网络 

分 类 号:TP393[自动化与计算机技术—计算机应用技术]

 

参考文献:

正在载入数据...

 

二级参考文献:

正在载入数据...

 

耦合文献:

正在载入数据...

 

引证文献:

正在载入数据...

 

二级引证文献:

正在载入数据...

 

同被引文献:

正在载入数据...

 

相关期刊文献:

正在载入数据...

相关的主题
相关的作者对象
相关的机构对象