检索规则说明:AND代表“并且”;OR代表“或者”;NOT代表“不包含”;(注意必须大写,运算符两边需空一格)
检 索 范 例 :范例一: (K=图书馆学 OR K=情报学) AND A=范并思 范例二:J=计算机应用与软件 AND (U=C++ OR U=Basic) NOT M=Visual
作 者:Chaofeng SHA Keqiang WANG Dell ZHANG Xiaoling WANG Aoying ZHOU
机构地区:[1]School of Computer Science, Shanghai Key Laboratory of Intelligent Information Processing, Fudan University, Shanghai 200433, China [2]Shanghai Key Laboratory of Trustworthy Computing, East China Normal University, Shanghai 200062, China [3]Department of Computer Science and Information Systems, Birkbeck, University of London, London WCIE 7HX, UK
出 处:《Frontiers of Computer Science》2016年第3期477-487,共11页中国计算机科学前沿(英文版)
基 金:This work was supported by the National Natural Science Foundation of China (Grant Nos. 61572135 and 61170085), 973 project (2010CB328106), Program for New Century Excellent Talents in China (NCET-10-0388).
摘 要:The key issue in top-k retrieval, finding a set of k documents (from a large document collection) that can best answer a user's query, is to strike the optimal balance between relevance and diversity. In this paper, we study the top-k re- trieval problem in the framework of facility location analysis and prove he submodularity of that objective function which provides a theoretical approximation guarantee of factor 1 -1/ε for the (best-first) greedy search algorithm. Furthermore, we propose a two-stage hybrid search strategy which first ob- tains a high-quality initial set of top-k documents via greedy search, and then refines that result set iteratively via local search. Experiments on two large TREC benchmark datasets show that our two-stage hybrid search strategy approach can supersede the existing ones effectively and efficiently.The key issue in top-k retrieval, finding a set of k documents (from a large document collection) that can best answer a user's query, is to strike the optimal balance between relevance and diversity. In this paper, we study the top-k re- trieval problem in the framework of facility location analysis and prove he submodularity of that objective function which provides a theoretical approximation guarantee of factor 1 -1/ε for the (best-first) greedy search algorithm. Furthermore, we propose a two-stage hybrid search strategy which first ob- tains a high-quality initial set of top-k documents via greedy search, and then refines that result set iteratively via local search. Experiments on two large TREC benchmark datasets show that our two-stage hybrid search strategy approach can supersede the existing ones effectively and efficiently.
关 键 词:top-k retrieval DIVERSIFICATION submodular function maximization
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在链接到云南高校图书馆文献保障联盟下载...
云南高校图书馆联盟文献共享服务平台 版权所有©
您的IP:18.223.188.252