Authors: Liu Ying (刘颖)[1], Chen Ling (陈岭)[1], Chen Gencai (陈根才)[1], Zhao Jiangqi (赵江奇), Wang Jingchang (王敬昌)
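Affiliations: [1] College of Computer Science and Technology, Zhejiang University, Hangzhou, Zhejiang 310027; [2] Zhejiang Hongcheng Computer Systems Co., Ltd., Hangzhou, Zhejiang 310009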
Source: Journal of Zhejiang University: Engineering Science (《浙江大学学报(工学版)》), 2013, No. 1, pp. 23-28, 161 (7 pages)
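Funding: National "Core Electronic Devices, High-end Generic Chips and Basic Software" ("核高基") Major Science and Technology Project (2010ZX01042-002-003); National Natural Science Foundation of China (60703040); Zhejiang Provincial Science and Technology Plan Major Project (2007C13019); Zhejiang Provincial Major Science and Technology Special Project (2011C13042); Hangzhou Major Science and Technology Innovation Special Project (20112311A20)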
Abstract: In distributed information retrieval, different collections contribute unequally to the final retrieval results. To address this, a collection selection method based on past click-through data (PCTD-CS) is proposed. The method uses click-through data to estimate the relevance of each collection to past queries, and estimates the similarity between queries by combining a term-based measure with a results-based measure. Past queries similar to a new query are then used to estimate the relevance of each collection to that new query; the M collections with the highest relevance are selected for retrieval, and the number of documents each collection should return is determined for the case where the top k documents are required. Recall (Rm), precision of the top n results (P@n), and mean average precision (MAP) were used to evaluate the method. Experimental results show that PCTD-CS improves both the recall and the precision of the retrieval results and more accurately locates the collections that contain many relevant documents.
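The pipeline described in the abstract can be illustrated with a rough Python sketch. The abstract does not give the paper's formulas, so everything concrete here is an assumption made for illustration: Jaccard overlap for both the term-based and results-based similarities, a linear mix with a hypothetical weight `alpha`, click shares as the per-collection relevance of a past query, and a proportional (largest-remainder) split of the k documents. The new query's `results` set is assumed to come from some inexpensive preliminary retrieval. None of the function or field names come from the paper.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard overlap of two iterables treated as sets (assumed measure)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def query_similarity(q1, q2, alpha=0.5):
    """Mix term-based and results-based similarity; alpha is hypothetical."""
    term_sim = jaccard(q1["terms"], q2["terms"])
    result_sim = jaccard(q1["results"], q2["results"])
    return alpha * term_sim + (1 - alpha) * result_sim


def collection_relevance(new_query, history, alpha=0.5):
    """Score each collection by similarity-weighted click shares of past queries."""
    scores = defaultdict(float)
    for past in history:
        sim = query_similarity(new_query, past, alpha)
        total_clicks = sum(past["clicks"].values())
        if sim == 0 or total_clicks == 0:
            continue  # dissimilar or click-free past queries contribute nothing
        for collection, clicks in past["clicks"].items():
            scores[collection] += sim * clicks / total_clicks
    return dict(scores)


def select_collections(scores, m, k):
    """Pick the m highest-scoring collections and split k documents among them."""
    top = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:m]
    total = sum(score for _, score in top)
    if total == 0:
        return {}
    quotas = {c: k * score / total for c, score in top}
    alloc = {c: int(q) for c, q in quotas.items()}
    leftover = k - sum(alloc.values())
    # Largest-remainder rule: hand out the remaining documents one by one.
    by_remainder = sorted(quotas, key=lambda c: (-(quotas[c] - alloc[c]), c))
    for c in by_remainder[:leftover]:
        alloc[c] += 1
    return alloc
```

Calling `collection_relevance` with a new query and a list of past queries (each carrying `terms`, `results`, and per-collection `clicks`) yields a score per collection; passing those scores to `select_collections` with M and k returns the number of documents to request from each selected collection. A different similarity measure or allocation rule can be swapped in without changing the overall structure.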
Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]