检索规则说明:AND代表“并且”;OR代表“或者”;NOT代表“不包含”;(注意必须大写,运算符两边需空一格)
检 索 范 例 :范例一: (K=图书馆学 OR K=情报学) AND A=范并思 范例二:J=计算机应用与软件 AND (U=C++ OR U=Basic) NOT M=Visual
机构地区:[1]哈尔滨理工大学计算机科学与技术学院,哈尔滨150080 [2]长春大学软件学院,长春130022 [3]哈尔滨工业大学计算机科学与技术学院,哈尔滨150001
出 处:《计算机研究与发展》2010年第7期1234-1243,共10页Journal of Computer Research and Development
基 金:黑龙江省自然科学基金项目(F200601)
摘 要:KNN连接作为数据挖掘的基元,可以用来大幅度提高相似搜索、数据分析和数据挖掘的速度.到目前为止,对KNN连接的研究主要在基于磁盘系统的背景下进行,假设数据库太大以至于不能装入主存.随着RAM越来越大,价格也越来越低廉,这种假设逐渐受到挑战.因此,有必要重新对基于主存的KNN连接进行研究.在高效主存索引的基础上,采用编码解码、自底向上、深度优先遍历和剪枝等技术提出了一种新的KNN连接算法Δ-tree-KNN-Join.该算法解决了KNN连接中确定搜索半径困难的问题,提高了连接效率.在真实数据和合成聚类数据上进行了实验,结果显示Δ-tree-KNN-Join是一种有效的主存KNN连接算法.KNN-Join is an important database primitive, which has been successfully applied to speed up applications such as similarity search, data analysis and data mining. Up to now, the KNN-Join has been studied in the context of disk-based systems where it is assumed that the databases are too large to fit into the main memory. This assumption is increasingly being challenged as RAM gets cheaper and larger. Therefore, it is necessary to study the problem of KNN-Join in the main memory. △-tree is a novel multi-level index structure that can speed up the high-dimensional query in the main memory environment. In this paper, a new KNN-Join approach is proposed, which uses △-tree as the underlying index structure, exploits coding and decoding, bottom up, depth first search and pruning technique. The problem that it is hard to identify the distance from each point in R to its K nearest neighbors in S is solved and the Join efficiency is improved. The correctness verification and cost analysis of the above mentioned algorithm are presented. Extensive experiments on both real datasets and synthetic clustered datasets are conducted, and the results illustrate that △-tree-KNN-Join is an effieient KNN-Join method in main memory.
分 类 号:TP311.13[自动化与计算机技术—计算机软件与理论]
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在链接到云南高校图书馆文献保障联盟下载...
云南高校图书馆联盟文献共享服务平台 版权所有©
您的IP:216.73.216.36