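Affiliation: [1] Key Laboratory of Information System Engineering, National University of Defense Technology, Changsha 410073, Hunan, China
Source: Journal of National University of Defense Technology (《国防科技大学学报》), 2013, No. 2, pp. 132-136 (5 pages)
Funding: National Natural Science Foundation of China (Grant No. 61070216)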
Abstract: Many information retrieval applications present their results as ranked lists, in which documents are sorted in descending order of relevance to a given query. This has drawn the attention of researchers in information retrieval and machine learning to methods that learn effective ranking models automatically, and many ranking models have been proposed. Most existing methods, however, ignore the significant homogeneity between query-document feature pairs and the user's relevance feedback. Building on machine learning algorithms, this paper proposes a method that extracts the main features of the training samples and clusters them, combines the clusters with user relevance feedback to obtain a confidence value for the relevance judgment in each cluster, and uses the resulting similarity model to rank test samples by relevance at query time. The method is evaluated on the LETOR benchmark dataset. The experimental results show that it improves on other ranking algorithms on information retrieval performance measures, without complex data preprocessing or manually set algorithm parameters.
Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]
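The pipeline summarized in the abstract (cluster the training samples, derive a per-cluster confidence from relevance feedback, rank test documents by that confidence) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name a clustering algorithm or a confidence formula, so plain k-means and the mean feedback label per cluster are assumptions made here, and all function names and the toy data are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centroids: List[List[float]], point: Vector) -> int:
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(centroids[i], point))


def kmeans(points: List[Vector], k: int, iterations: int = 20) -> List[List[float]]:
    """Plain k-means (assumed here; the abstract names no specific algorithm).

    Centroids start at the first k points so the result is deterministic.
    """
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        groups: List[List[Vector]] = [[] for _ in range(k)]
        for p in points:
            groups[nearest(centroids, p)].append(p)
        for i, group in enumerate(groups):
            if group:  # keep the old centroid if a cluster becomes empty
                centroids[i] = [sum(col) / len(group) for col in zip(*group)]
    return centroids


def cluster_confidence(centroids: List[List[float]],
                       points: List[Vector],
                       feedback: List[int]) -> List[float]:
    """Confidence per cluster = mean relevance feedback of its members (assumption)."""
    totals = [0.0] * len(centroids)
    counts = [0] * len(centroids)
    for p, label in zip(points, feedback):
        c = nearest(centroids, p)
        totals[c] += label
        counts[c] += 1
    return [t / n if n else 0.0 for t, n in zip(totals, counts)]


def rank(centroids: List[List[float]],
         confidence: List[float],
         docs: List[Tuple[str, Vector]]) -> List[str]:
    """Sort documents by the confidence of their nearest cluster, highest first."""
    scored = [(confidence[nearest(centroids, vec)], doc_id) for doc_id, vec in docs]
    return [doc_id for _, doc_id in sorted(scored, key=lambda s: -s[0])]


# Toy query-document feature vectors with binary relevance feedback (made up).
train = [[0.9, 0.8], [0.1, 0.2], [0.8, 0.9], [0.2, 0.1], [0.85, 0.7], [0.15, 0.3]]
feedback = [1, 0, 1, 0, 1, 1]

centroids = kmeans(train, k=2)
confidence = cluster_confidence(centroids, train, feedback)
ranking = rank(centroids, confidence,
               [("d1", [0.2, 0.2]), ("d2", [0.9, 0.9]), ("d3", [0.1, 0.1])])
print(ranking)
```

In this sketch, documents that fall into the same cluster receive the same score and keep their input order, so a real system would need a finer tie-breaking signal, for example the distance to the centroid.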