面向文档信息检索的排序学习算法  

A Learning-to-Rank Algorithm for Document Information Retrieval


Authors: 周祖坤 [1], 杨光, 冯小坤

Affiliations: [1] 昆明冶金高等专科学校, Kunming, Yunnan 650221, China; [2] 云南文化艺术职业学院, Kunming, Yunnan 650111, China; [3] 云南大学滇池学院, Kunming, Yunnan 650228, China

Source: 《自动化技术与应用》 (Techniques of Automation and Applications), 2018, No. 2, pp. 40-45 (6 pages)

Abstract: In learning-to-rank-based information retrieval, different queries and their candidate document lists can differ considerably, but traditional learning-to-rank methods ignore these per-query differences. In addition, individual ranking algorithms have different preferences and emphases, which degrades ranking performance on the validation data sets. To address these problems, this paper proposes a supervised multi-model learning-to-rank algorithm based on model aggregation. The algorithm trains one sub-model on each manually annotated query-document list to capture query features and assigns each sub-model its own scoring weight. An inverse trigonometric function with a coefficient is used to smooth the defined aggregation loss function so that it is continuous and differentiable, and the sub-model weights and the associated coefficient are learned through iterative gradient-based optimization. The documents of a query are then ranked by combining their scores with the learned sub-model weights. Comparative experiments on multiple data sets show that the proposed supervised model-aggregation algorithm outperforms traditional learning-to-rank algorithms.
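As a rough illustration of the aggregation scheme sketched in the abstract, the snippet below combines per-document scores from several sub-models using learnable weights and smooths a pairwise ranking loss with a scaled arctan, which is one possible reading of "an inverse trigonometric function with a coefficient". The pairwise loss definition, the numerical gradient, the fixed coefficient alpha, and all function names are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def aggregate_scores(sub_model_scores, weights):
    """Combine per-document scores from K sub-models into one score per document.

    sub_model_scores: array of shape (K, n_docs), scores from each sub-model.
    weights:          array of shape (K,), one aggregation weight per sub-model.
    """
    return weights @ sub_model_scores  # shape (n_docs,)

def smooth_pairwise_loss(scores, labels, alpha):
    """Hypothetical pairwise aggregation loss smoothed with a scaled arctan.

    For every document pair (i, j) with labels[i] > labels[j], the non-smooth
    indicator 1[scores[i] < scores[j]] is replaced by the continuous and
    differentiable surrogate 0.5 - arctan(alpha * (scores[i] - scores[j])) / pi.
    """
    loss = 0.0
    for i in range(len(labels)):
        for j in range(len(labels)):
            if labels[i] > labels[j]:
                diff = scores[i] - scores[j]
                loss += 0.5 - np.arctan(alpha * diff) / np.pi
    return loss

def train_weights(sub_model_scores, labels, alpha=1.0, lr=0.05, n_iter=200, eps=1e-4):
    """Fit sub-model weights by an iterative gradient method (numerical gradient)."""
    k = sub_model_scores.shape[0]
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        base = smooth_pairwise_loss(aggregate_scores(sub_model_scores, weights), labels, alpha)
        grad = np.zeros(k)
        for m in range(k):
            bumped = weights.copy()
            bumped[m] += eps
            grad[m] = (smooth_pairwise_loss(aggregate_scores(sub_model_scores, bumped),
                                            labels, alpha) - base) / eps
        weights -= lr * grad          # gradient step on the smoothed loss
        weights = np.clip(weights, 0.0, None)
        if weights.sum() > 0:
            weights /= weights.sum()  # keep weights non-negative and normalized
    return weights

# Toy usage: 3 sub-models scoring 4 candidate documents for one query.
scores = np.array([[0.2, 0.9, 0.4, 0.1],
                   [0.8, 0.3, 0.5, 0.2],
                   [0.1, 0.7, 0.6, 0.3]])
labels = np.array([0, 2, 1, 0])       # graded relevance judgments
w = train_weights(scores, labels)
ranking = np.argsort(-aggregate_scores(scores, w))
print("learned weights:", np.round(w, 3))
print("ranked document indices:", ranking)
```

Per the abstract, the paper also learns the coefficient of the inverse trigonometric function jointly with the sub-model weights; the sketch holds alpha fixed only to stay short.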

Keywords: learning to rank; information retrieval; query differences; ranking model aggregation; loss function

CLC Number: TP391.1 [Automation and Computer Technology: Computer Application Technology]

 
