Re-rank Retrieval Results Through Subject Indexing (利用主题标引进行查询重排序)


Authors: Mao Jin (毛进)[1], Li Gang (李纲)[1], Cao Yujie (操玉杰)[2]

Affiliations: [1] Center for Studies of Information Resources, Wuhan University, Wuhan 430072; [2] NetEase (Hangzhou) Network Co., Ltd., Hangzhou 310052

Source: New Technology of Library and Information Service (《现代图书情报技术》), 2014, No. 7, pp. 48-55 (8 pages)

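Funding: One of the research outputs of the Major Project of the National Social Science Fund of China, "Research on the Construction of an Emergency Decision-Making Intelligence System for Smart Cities" (Grant No. 13&ZD173)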

Abstract: [Objective] This paper re-ranks search results with the help of subject indexing within the pseudo-relevance feedback process. [Methods] Using a language modeling approach, the association between subject terms and user queries is mined so that each user query is represented as a probability distribution over subject terms. Generative language models are built for the subject terms and used to determine the weights of subject terms in documents. On this basis, the scores of the documents returned by the initial retrieval are recalculated and the documents are re-ranked according to the new scores. [Results] The proposed method builds suitable generative language models for subject terms and mines the weights of subject terms in documents appropriately. The re-ranked results show a general performance improvement over the initial retrieval. [Limitations] Different methods of mining the associations between subject terms and documents are not compared, and the approach is not tested on datasets of different scales or in different languages. [Conclusions] Re-ranking that exploits the associations between subject terms and user queries and between subject terms and documents can improve retrieval precision.
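The abstract does not state the formulas used, so the pipeline it describes can only be sketched under assumptions. In the Python sketch below, every function name, the additive smoothing constant `mu`, the interpolation weight `lam`, the feedback depth `k`, and the toy documents are hypothetical and not taken from the paper. It assumes that each subject term's language model is estimated from the text of the documents indexed with that term, that the query's distribution over subject terms is estimated from the top-`k` initially retrieved documents, and that initial scores are positive.

```python
import math
from collections import Counter, defaultdict


def subject_language_models(docs, mu=0.1):
    """Estimate P(w|s) from the tokens of all documents indexed with subject term s."""
    vocab = {w for d in docs.values() for w in d["tokens"]}
    counts = defaultdict(Counter)
    for d in docs.values():
        for s in d["subjects"]:
            counts[s].update(d["tokens"])
    models = {}
    for s, c in counts.items():
        total = sum(c.values()) + mu * len(vocab)  # additive smoothing (assumed)
        models[s] = {w: (c[w] + mu) / total for w in vocab}
    return models


def subject_weights(doc, models):
    """Weight of each assigned subject term in a document (assumed form):
    per-token geometric-mean likelihood under the term's model, normalised to sum to 1."""
    if not doc["tokens"] or not doc["subjects"]:
        return {}
    raw = {}
    for s in doc["subjects"]:
        avg_ll = sum(math.log(models[s][w]) for w in doc["tokens"]) / len(doc["tokens"])
        raw[s] = math.exp(avg_ll)
    z = sum(raw.values())
    return {s: v / z for s, v in raw.items()}


def query_subject_distribution(initial_scores, docs, models, k=2):
    """P(s|q): mix the subject weights of the top-k documents by their initial scores."""
    top = sorted(initial_scores, key=initial_scores.get, reverse=True)[:k]
    z = sum(initial_scores[d] for d in top)
    p = defaultdict(float)
    for d in top:
        for s, w in subject_weights(docs[d], models).items():
            p[s] += w * initial_scores[d] / z
    return dict(p)


def rerank(initial_scores, docs, lam=0.5, k=2):
    """Interpolate the normalised initial score with the subject-based score."""
    models = subject_language_models(docs)
    p_sq = query_subject_distribution(initial_scores, docs, models, k)
    z = sum(initial_scores.values())
    new_scores = {}
    for d, score in initial_scores.items():
        weights = subject_weights(docs[d], models)
        subject_score = sum(p_sq.get(s, 0.0) * w for s, w in weights.items())
        new_scores[d] = lam * score / z + (1 - lam) * subject_score
    ranking = sorted(new_scores, key=new_scores.get, reverse=True)
    return ranking, new_scores, p_sq


# Toy collection (hypothetical): tokens plus manually assigned subject terms.
docs = {
    "d1": {"tokens": ["language", "model", "retrieval"], "subjects": ["information retrieval"]},
    "d2": {"tokens": ["language", "model", "smoothing"], "subjects": ["language model"]},
    "d3": {"tokens": ["city", "emergency", "decision"], "subjects": ["emergency management"]},
    "d4": {"tokens": ["retrieval", "ranking", "model"],
           "subjects": ["information retrieval", "language model"]},
}
initial_scores = {"d1": 0.9, "d2": 0.8, "d3": 0.75, "d4": 0.7}  # first-pass retrieval scores

ranking, new_scores, p_sq = rerank(initial_scores, docs)
print(ranking)
```

In this toy data, `d3` carries a subject term that receives no probability mass from the query, so it falls below `d4` after re-ranking even though its initial score was higher. The linear interpolation with `lam` is one simple way to combine the two scores; the paper may well combine them differently.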

Keywords: language model; information retrieval; subject term; subject indexing; query re-ranking

Classification: TP391.1 [Automation and Computer Technology: Computer Application Technology]

 
