Authors: ZHONG Minjuan (钟敏娟) [1,2], WAN Changxuan (万常选) [1,2], LIU Dexi (刘德喜) [1,2], JIANG Tengjiao (江腾蛟) [1,2], LIU Aihong (刘爱红) [1,2]
Affiliations: [1] School of Information Technology, Jiangxi University of Finance and Economics, Nanchang 330013, China; [2] Jiangxi Key Laboratory of Data and Knowledge Engineering, Jiangxi University of Finance and Economics, Nanchang 330013, China
Source: Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), 2016, No. 12, pp. 1673-1682 (10 pages)
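Funding: National Natural Science Foundation of China, Nos. 61363039, 61363010, 71361012, 61562032; National Social Science Foundation of China, No. 12CTQ042; Natural Science Foundation of Jiangxi Province, Nos. 20142BAB217014, 20142BAB207010; Jiangxi Provincial Universities Humanities and Social Sciences Research Planning Fund, No. TQ1504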
Abstract: Pseudo relevance feedback (PRF) has long been regarded as an effective query expansion technique. However, traditional PRF is prone to topic drift, in which the expanded query moves away from the original query and retrieval performance degrades. Two issues are central to controlling topic drift in PRF: how to identify a high-quality set of relevant documents among the top-ranked results without additional information, and how to select useful expansion terms from that set. This paper proposes a solution framework for extensible markup language (XML) documents. First, taking both the content and the structure of XML into account, it proposes a method for finding high-quality XML pseudo-relevant documents that combines search result clustering with a two-stage ranking model. Second, for content-only (CO) queries, it studies term expansion and proposes a term weighting method that incorporates structural semantics. Experimental results show that the proposed XML PRF query expansion method effectively reduces topic drift and achieves better retrieval quality.
Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]
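
The basic pseudo relevance feedback loop that the abstract takes as its starting point can be illustrated with a minimal Python sketch. This is not the method proposed in the paper: it shows only the classic baseline, in which the top-ranked documents are assumed to be relevant and their most frequent new terms are appended to the query. The function name `expand_query`, the parameters `top_k` and `n_expansion`, and the toy documents are all hypothetical, and plain term frequency stands in for any real term weighting scheme.

```python
from collections import Counter


def expand_query(query_terms, ranked_docs, top_k=2, n_expansion=2):
    """Baseline PRF sketch: assume the top_k ranked documents are relevant
    and append their most frequent terms that are not already in the query.

    ranked_docs is a list of tokenized documents, best-ranked first.
    """
    feedback_docs = ranked_docs[:top_k]

    # Count candidate expansion terms across the pseudo-relevant documents.
    counts = Counter()
    for doc in feedback_docs:
        counts.update(term for term in doc if term not in query_terms)

    # Rank by frequency (descending), breaking ties alphabetically so the
    # result is deterministic.
    candidates = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    expansion = [term for term, _ in candidates[:n_expansion]]
    return list(query_terms) + expansion


# Toy ranked list (hypothetical data): the third document is off-topic.
docs = [
    ["xml", "retrieval", "structure"],
    ["xml", "structure", "query"],
    ["cooking", "recipe"],
]

focused = expand_query(["xml"], docs, top_k=2, n_expansion=2)
drifted = expand_query(["xml"], docs, top_k=3, n_expansion=5)
```

With `top_k=2` the expansion stays on topic, while `top_k=3` lets terms from the off-topic document into the query. That is the topic drift the abstract refers to, and it is why the choice of feedback documents and the weighting of expansion terms both matter.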