Research on Retrieval and Analysis Methods of Papers Based on Large Language Models (cited by: 2)

Authors: XIE Mian; CHEN Gang; YU Xiao-han (School of Command and Control Engineering, Army Engineering University of PLA, Nanjing 210007, China)

Affiliation: [1] School of Command and Control Engineering, Army Engineering University of PLA, Nanjing 210007, Jiangsu, China

Source: Computer Technology and Development (计算机技术与发展), 2024, No. 12, pp. 116-124 (9 pages)

Abstract: Efficient and accurate retrieval of relevant papers is a crucial step in modern academic research. Traditional retrieval methods usually rely on precise keyword input, which requires users to have enough domain knowledge to choose and use appropriate terms. To address this problem, this paper explores a method that uses Large Language Models (LLMs) to retrieve and analyze papers based on their content, aiming to lower the barrier that specialized search terms create while also supporting some analysis of paper content. First, a design framework for content-based paper retrieval and analysis is proposed; built on paper parsing and a vector database, it handles retrieval and analysis for a single paper, for multiple papers, and for vague lay descriptions. Second, a paper parsing method is designed, together with LLM prompts for extracting a paper's main content; the prompts guide the model to focus on the paper's representative key information, which improves retrieval performance, and a comparative analysis identifies the prompts that extract information more effectively. Finally, comparative experiments demonstrate the feasibility and effectiveness of the method: when retrieving by the full text of a paper and by a vague lay description, the mAP reaches 98.47% and 99.51%, respectively.
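Since the abstract reports its results as mAP (mean average precision), a minimal sketch of that metric may help in reading the two figures. The abstract does not state the exact evaluation protocol (rank cutoff, how relevance was judged), so the code below assumes the standard definition with binary relevance and no cutoff; the function names and the toy document IDs are hypothetical and are not taken from the paper.

```python
from typing import Iterable, Sequence, Set, Tuple


def average_precision(ranked_ids: Sequence[str], relevant: Set[str]) -> float:
    """Average precision for one query.

    Averages precision@k over the ranks k at which a relevant document
    appears, divided by the total number of relevant documents.
    Assumption: binary relevance, no rank cutoff.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for k, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / k  # precision at rank k
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Iterable[Tuple[Sequence[str], Set[str]]]
) -> float:
    """Mean of average_precision over all (ranking, relevant set) pairs."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Toy data (hypothetical): two queries with their ranked results and
# the set of documents judged relevant for each.
toy_runs = [
    (["a", "b", "c"], {"a", "c"}),  # AP = (1/1 + 2/3) / 2
    (["x", "y"], {"y"}),            # AP = (1/2) / 1
]
toy_map = mean_average_precision(toy_runs)
```

Under this definition, an mAP close to 100% means that, averaged over queries, the relevant papers are ranked at or very near the top of the result list.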

Keywords: document retrieval; document analysis; large language models; prompt engineering; academic papers

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

