BC-iDistance: an optimized high-dimensional index for KNN processing


Authors: Liang Junjie (梁俊杰), Feng Yucai (冯玉才)

Affiliations: [1] Faculty of Mathematics and Computer Science, Hubei University; [2] College of Computer Science and Technology, Huazhong University of Science and Technology

Source: Journal of Harbin Institute of Technology (New Series) (哈尔滨工业大学学报(英文版)), 2008, No. 6, pp. 856-861 (6 pages)

Funding: Sponsored by the National High Technology Research and Development Program of China (863 Program) (Grant No. [2005]555)

Abstract: To facilitate high-dimensional KNN queries, an optimized index named Bit-Code based iDistance (BC-iDistance) is proposed, built on the techniques of approximate vector representation and one-dimensional transformation. To overcome the heavy information loss that iDistance incurs in its one-dimensional transformation, BC-iDistance compresses each d-dimensional vector into a two-dimensional vector, using a bit code and a one-dimensional distance to reflect, respectively, the location and the similarity of the data point relative to its reference point. Built on the classical B+-tree, this representation enables a two-level pruning process and allows a single index structure to be used, which further speeds up processing. Experimental evaluations on synthetic and real data demonstrate that BC-iDistance outperforms iDistance and sequential scan for KNN search in high-dimensional spaces.
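
The abstract does not say how the bit code is computed or how the two pruning levels are applied, so the following Python sketch is only one plausible reading, not the authors' method. It assumes one bit per dimension (set when the point's coordinate is at least the reference point's coordinate) and the Euclidean distance to the reference point as the one-dimensional value. The names `encode` and `may_qualify` are hypothetical.

```python
import math


def encode(point, ref):
    """Compress a d-dimensional point into a (bit_code, distance) pair.

    Assumption: bit j is 1 when point[j] >= ref[j]; the abstract does not
    define the bit code, so this is only an illustrative choice.
    """
    bits = 0
    for j, (x, o) in enumerate(zip(point, ref)):
        if x >= o:
            bits |= 1 << j
    return bits, math.dist(point, ref)


def may_qualify(query, ref, code, radius):
    """Return False only if the encoded point is provably farther than radius."""
    p_bits, p_dist = code
    q_bits, q_dist = encode(query, ref)

    # Level 1: the triangle inequality gives |d(q, O) - d(p, O)| <= d(p, q).
    if abs(q_dist - p_dist) > radius:
        return False

    # Level 2: in every dimension where the bits differ, p and q lie on
    # opposite sides of the reference point, so |p_j - q_j| >= |q_j - O_j|.
    differing = p_bits ^ q_bits
    lower_sq = sum(
        (query[j] - ref[j]) ** 2
        for j in range(len(ref))
        if (differing >> j) & 1
    )
    return lower_sq <= radius ** 2


ref = (0.5, 0.5)
code = encode((0.9, 0.1), ref)
print(may_qualify((0.1, 0.9), ref, code, 0.3))
print(may_qualify((0.85, 0.15), ref, code, 0.1))
```

Both tests use only the compressed pair, so a candidate can be discarded without reading the full vector. Points that pass still need an exact distance check. How the pair is mapped to a B+-tree key is not stated in the abstract and is left out here.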

Keywords: high-dimensional index; KNN search; bit code; approximate vector

Classification code: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
