Novel high-dimensional indexing structure based on dual-distance metric  Cited by: 3

Authors: Zhuang Yi (庄毅)[1], Weng Jianguang (翁建广)[1], Zhuang Yueting (庄越挺)[1], Wu Fei (吴飞)[1]

Affiliation: [1] College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, Zhejiang, China

Source: Journal of Zhejiang University: Engineering Science, 2007, No. 3, pp. 380-385 (6 pages)

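Funding: National Natural Science Foundation of China (60533090; 60272031); National Science Fund for Distinguished Young Scholars (60525108); China-US Million Book Digital Library Project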

Abstract: To speed up similarity search over high-dimensional data, a novel high-dimensional indexing structure based on a dual-distance metric (DDM) was proposed. A four-tuple data structure for the DDM was obtained through modelling. The points in the high-dimensional space were grouped into clusters using the k-means clustering algorithm. For every point, the start distance and the centroid distance were computed, and a weighted centroid distance was derived from them. This weighted centroid distance served as the index key of the point, the keys were indexed with a partition-based B+-tree, and the index construction algorithm was obtained. Queries in the high-dimensional space were thereby transformed into searches in a one-dimensional space. The effects of dimensionality, data size and query request parameters on query performance were investigated. The results show that the DDM reduces the search space more effectively and lowers the cost of distance computation, which makes it particularly suitable for querying large-scale high-dimensional data.
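The pipeline described in the abstract (cluster, compute two distances per point, index a one-dimensional key, filter before computing full distances) can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not give the weighting formula or the four-tuple layout, so the sketch makes several assumptions. The start point is assumed to be the origin, the plain centroid distance is used as the sort key, the start distance is kept as a second filter instead of being weighted into the key, and one sorted list per cluster stands in for a partition of the B+-tree. All function names (`dist`, `kmeans`, `build_index`, `range_query`) are hypothetical.

```python
import bisect
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd-style k-means; returns k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: dist(p, centroids[c]))
            groups[j].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[j]
            for j, g in enumerate(groups)
        ]
    return centroids


def build_index(points, k):
    """One sorted list per cluster (stand-in for a B+-tree partition).

    Each entry is (centroid_distance, start_distance, point_id).
    Assumption: the start point is the origin.
    """
    centroids = kmeans(points, k)
    start = (0.0,) * len(points[0])
    partitions = [[] for _ in range(k)]
    for pid, p in enumerate(points):
        j = min(range(k), key=lambda c: dist(p, centroids[c]))
        partitions[j].append((dist(p, centroids[j]), dist(p, start), pid))
    for part in partitions:
        part.sort()
    return centroids, partitions


def range_query(points, centroids, partitions, q, r, eps=1e-9):
    """Return ids of all points within distance r of q.

    By the triangle inequality, a point p with dist(p, q) <= r satisfies
    |dist(p, c) - dist(q, c)| <= r for any reference point c, so both
    filters below can discard candidates without losing true results.
    """
    start = (0.0,) * len(q)
    q_sd = dist(q, start)
    hits = []
    for c, part in zip(centroids, partitions):
        q_cd = dist(q, c)
        # One-dimensional search on the centroid-distance key.
        lo = bisect.bisect_left(part, (q_cd - r - eps,))
        hi = bisect.bisect_right(part, (q_cd + r + eps, math.inf))
        for cd, sd, pid in part[lo:hi]:
            if abs(sd - q_sd) > r + eps:
                continue  # pruned by the start-distance filter
            if dist(points[pid], q) <= r:  # full distance only for survivors
                hits.append(pid)
    return sorted(hits)
```

A k-nearest-neighbor query, which the keywords mention, could be layered on top of `range_query` by enlarging `r` step by step until at least k results are found; that strategy is likewise an assumption, not something the abstract specifies.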

Keywords: k-nearest-neighbor query; cluster hypersphere; centroid distance; start distance

Classification code: TP301 [Automation and Computer Technology: Computer System Architecture]

 
