Authors: HE Weidong; YANG Zhihao [1]; WANG Zhizheng; LIN Hongfei [1] (School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China)
Affiliation: [1] School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, Liaoning, China
Source: Journal of Shanxi University (Natural Science Edition), 2023, No. 3, pp. 491-499 (9 pages)
Funding: National Natural Science Foundation of China (62276043); Fundamental Research Funds for the Central Universities (DUT22ZD205).
Abstract: Pointwise reranking strategies based on deep semantic representations ignore the partial order relationship among retrieved documents. In addition, the content representation of patient case queries must meet the specific needs of the biomedical domain. To address these problems, this paper proposes a partial-order document retrieval method based on the biomedical pre-trained language model BioBERT. The method recalls documents with BM25 and then applies pointwise and pairwise feature extraction, in that order, to the documents to be ranked. The pointwise method captures the global position features of the documents to be ranked, while the pairwise method, which introduces query features, learns the relative partial order among them. Experiments on the TREC 2019 Precision Medicine Track dataset show that the method improves p@10 by 3.3% over the best baseline method.
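The pipeline outlined in the abstract (BM25 recall, then pointwise and pairwise scoring, evaluated with p@10) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not say how the two stages are combined, so this sketch ranks by pairwise win count and breaks ties with the pointwise score, and the toy `overlap`-based scorers merely stand in for the BioBERT models used in the paper. The names `bm25_scores`, `rerank`, `toy_pointwise`, `toy_pairwise`, and `precision_at_k` are hypothetical.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Okapi BM25 score of every tokenized document against a tokenized query."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    df = Counter(t for d in docs for t in set(d))  # document frequency per term
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query:
            if t not in tf:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores


def rerank(query, docs, top_k, pointwise, pairwise):
    """Recall top_k documents with BM25, then rerank them.

    Assumed combination rule (not taken from the paper): sort by the number
    of pairwise wins, using the pointwise score as a tie-breaker.
    """
    scores = bm25_scores(query, docs)
    recalled = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)[:top_k]
    point = {i: pointwise(query, docs[i]) for i in recalled}
    wins = {i: 0.0 for i in recalled}
    for i in recalled:
        for j in recalled:
            if i != j:
                wins[i] += pairwise(query, docs[i], docs[j])
    return sorted(recalled, key=lambda i: (wins[i], point[i]), reverse=True)


def precision_at_k(ranked, relevant, k=10):
    """Fraction of the top-k ranked document ids that are relevant."""
    return sum(1 for i in ranked[:k] if i in relevant) / k


# Toy stand-ins for the learned scorers (hypothetical, for illustration only).
def overlap(query, doc):
    return len(set(query) & set(doc)) / len(set(query))


def toy_pointwise(query, doc):
    return overlap(query, doc)


def toy_pairwise(query, doc_a, doc_b):
    # 1.0 if doc_a should be ranked above doc_b for this query, else 0.0
    return 1.0 if overlap(query, doc_a) > overlap(query, doc_b) else 0.0


docs = [
    ["melanoma", "braf", "mutation", "therapy"],
    ["braf", "inhibitor", "melanoma", "trial", "braf"],
    ["diabetes", "insulin", "therapy"],
    ["weather", "report"],
]
query = ["melanoma", "braf"]
ranked = rerank(query, docs, top_k=3, pointwise=toy_pointwise, pairwise=toy_pairwise)
print(ranked, precision_at_k(ranked, relevant={0, 1}, k=2))
```

In a real system the two toy scorers would be replaced by model calls; the pairwise loop is quadratic in `top_k`, which is one reason to keep the BM25 recall set small.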