检索规则说明:AND代表“并且”;OR代表“或者”;NOT代表“不包含”;(注意必须大写,运算符两边需空一格)
检 索 范 例 :范例一: (K=图书馆学 OR K=情报学) AND A=范并思 范例二:J=计算机应用与软件 AND (U=C++ OR U=Basic) NOT M=Visual
作 者:贺广福 薛源海[1] 陈翠婷[1,2] 俞晓明 刘欣然[1,3] 程学旗 HE Guangfu;XUE Yuanhai;CHEN Cuiting;YU Xiaoming;LIU Xinran;CHENG Xueqi(Institute of Computing Technology,Chinese Academy of Sciences,Beijing 100190,China;University of Chinese Academy of Sciences,Beijing 101408,China;Beijing University of Posts and Telecommunications,Beijing 100876,China)
机构地区:[1]中国科学院计算技术研究所,北京100190 [2]中国科学院大学,北京101408 [3]北京邮电大学,北京100876
出 处:《大数据》2024年第1期17-34,共18页Big Data Research
基 金:国家自然科学基金项目(No.U21B2046)。
摘 要:近似最近邻搜索(ANNS)是计算机领域中一种重要的高效相似度搜索技术,可用于在大规模数据集中进行快速信息检索。随着人们对高精度信息检索的需求不断增长,同时使用结构化信息和非结构化信息进行混合查询的方式也得到了广泛应用。然而,基于近邻图的过滤贪心算法在混合查询时可能会因结构化约束条件的影响导致连通性降低,进而损害搜索精度。为此,提出了一种基于容忍因子的过滤贪心算法,通过容忍因子控制不满足结构化约束条件的顶点参与路由,在不改变索引结构的前提下维持原有近邻图的连通性,克服了结构化约束条件对检索精度的负面影响。实验结果证明,新算法可以在不同结构化约束强度下实现ANNS的高精度搜索,同时保持检索效率。该研究解决了基于近邻图的ANNS在混合查询场景中的问题,为大规模数据集的快速混合查询信息检索提供了一种有效的解决方案。Approximate nearest neighbor search(ANNS)is an important technique in the field of computer science for efficient similarity search,enabling fast information retrieval in large-scale datasets.With the increasing demand for high-precision information retrieval,there is a growing use of the hybrid query method that utilizes both structured and unstructured information for querying,which has broad application prospects.However,filtered greedy algorithms based on nearest neighbor graphs may reduce the connectivity of the graph due to the impact of structural constraints in hybrid queries,ultimately damaging search accuracy.This article proposes a tolerance factor based filtered greedy algorithm,which controls the participation of vertices that do not meet structural constraints in routing through a tolerance factor,maintaining the connectivity of the original nearest neighbor graph without changing the index structure,and overcoming the negative impact of structural constraints on retrieval accuracy.The experimental results demonstrate that the new method can achieve high-precision search for ANNS under different levels of structural constraints,while maintaining retrieval efficiency.This study solves the problem of ANNS based on nearest neighbor graphs in hybrid query scenarios,providing an effective solution for fast hybrid query information retrieval in large-scale datasets.
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在链接到云南高校图书馆文献保障联盟下载...
云南高校图书馆联盟文献共享服务平台 版权所有©
您的IP:18.224.153.49