Large Language Model-Driven Academic Text Mining: Construction and Evaluation of Inference-End Prompting Strategy  (Cited by: 4)


Authors: Lu Wei (陆伟) [1,2]; Liu Yinpeng (刘寅鹏); Shi Xiang (石湘); Liu Jiawei (刘家伟); Cheng Qikai (程齐凯) [1,2]; Huang Yong (黄永) [1,2]; Wang Lei (汪磊)

Affiliations: [1] School of Information Management, Wuhan University, Wuhan 430072; [2] Information Retrieval and Knowledge Mining Laboratory, Wuhan University, Wuhan 430072

Source: Journal of the China Society for Scientific and Technical Information (《情报学报》), 2024, Issue 8, pp. 946-959 (14 pages)

Funding: Key Program of the National Natural Science Foundation of China, "Theoretical Transformation of Scientific and Technological Information Resources and Knowledge Management Empowered by Data Intelligence" (72234005); General Program of the National Natural Science Foundation of China, "Argumentation Logic Recognition in Scientific Proposition Texts Based on Machine Reading Comprehension" (72174157)

Abstract: The task comprehension and instruction-following abilities of large language models (LLMs) enable users to complete complex information-processing tasks through simple instruction-based interaction. The field of scientific literature analysis is actively exploring applications of LLMs, but systematic research on prompt-engineering techniques and model capability boundaries is still lacking. Taking academic text mining as its entry point, this study designs inference-end prompting strategies from the perspectives of in-context learning and chain-of-thought reasoning, and constructs an evaluation framework for LLM proficiency in academic text mining that covers six tasks across four capability dimensions: text classification, information extraction, text reasoning, and text generation. Seven mainstream instruction-tuned models from China and abroad were selected for experiments comparing the applicable scope of different prompting strategies and the professional capabilities of models at different parameter scales. The results show that complex prompting strategies such as few-shot and chain-of-thought yield little benefit on classification tasks but perform well on more difficult tasks such as extraction and generation. With instruction guidance, models at the hundred-billion-parameter scale can achieve results comparable to those of fully trained deep-learning models; for models at the billion- or ten-billion-parameter scale, however, inference-end prompting strategies face a clear upper limit. To achieve deep integration of LLMs into the field of scientific and technical intelligence, domain adaptation of model parameters at the tuning end is still required at this stage.
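The inference-end prompting strategies the abstract names (a zero-shot baseline, few-shot in-context learning, and chain-of-thought reasoning) can be sketched as simple prompt templates. The function names and template wording below are illustrative assumptions for an academic text-classification task, not the paper's actual prompts.

```python
# Minimal sketch of three inference-end prompting strategies for an
# academic text-mining task. Templates are hypothetical examples.

def zero_shot(task: str, text: str) -> str:
    """Bare instruction: task description plus the input text."""
    return f"{task}\nText: {text}\nAnswer:"

def few_shot(task: str, examples: list[tuple[str, str]], text: str) -> str:
    """In-context learning: prepend labeled demonstrations to the query."""
    demos = "\n".join(f"Text: {x}\nAnswer: {y}" for x, y in examples)
    return f"{task}\n{demos}\nText: {text}\nAnswer:"

def chain_of_thought(task: str, text: str) -> str:
    """Ask the model to reason step by step before answering."""
    return f"{task}\nText: {text}\nLet's think step by step.\nAnswer:"

if __name__ == "__main__":
    task = "Classify the rhetorical role of the sentence (background/method/result)."
    demos = [("We collected 10,000 abstracts from PubMed.", "method")]
    print(few_shot(task, demos, "Accuracy improved by 4 points."))
```

Per the abstract's findings, a strategy like `few_shot` or `chain_of_thought` would mainly pay off on extraction and generation tasks, while `zero_shot` suffices for classification.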

Keywords: large language models; academic text mining; prompt engineering; capability evaluation

Classification: TP391.1 (Automation and Computer Technology: Computer Application Technology)
