Authors: 谭琛瀚 (TAN Chen-Han); 贾克斌 (JIA Ke-Bin); 王浩宇 (WANG Hao-Yu)
Affiliations: [1] Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China [2] Beijing Laboratory of Advanced Information Network, Beijing 100876, China
Source: Computer Systems & Applications (《计算机系统应用》), 2025, No. 2, pp. 28-36 (9 pages)
Funding: Beijing Natural Science Foundation (4212001).
Abstract: Automatic text summarization is an important branch of natural language processing (NLP), and one of its main difficulties is evaluating the quality of generated summaries quickly, objectively, and accurately. Existing methods for evaluating text summary quality suffer from low evaluation accuracy, a need for reference texts, and high consumption of computing resources. To address these problems, this study proposes a text summary quality evaluation method based on large language models. It designs a prompt construction method based on the chain of thought (CoT) principle to improve the performance of large language models on the summary quality evaluation task. It also generates a chain-of-thought dataset and uses it to fine-tune a small large language model, which significantly reduces the computing requirements. The method proceeds in four steps. First, the evaluation dimensions are determined according to the characteristics of text summaries, and prompts are constructed based on the CoT principle. Second, the prompts guide a large-scale large language model to generate a chain-of-thought process and an evaluation result for each summary sample, and these outputs are collected into a chain-of-thought dataset. Third, the generated dataset is used to fine-tune a small-scale large language model. Finally, the fine-tuned small-scale model performs the summary quality evaluation task. Comparative experiments and analyses on the SummEval dataset show that the method significantly improves the evaluation accuracy of the small-scale large language model on this task. The result is a text summary quality evaluation method that requires no reference texts, achieves high evaluation accuracy, has low computing requirements, and is easy to deploy.
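The data flow described in the abstract (build a CoT prompt, collect the large model's reasoning and score, store the pair as fine-tuning data) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The dimension names, the prompt wording, the 1-5 score scale, the record fields `prompt` and `completion`, and all function names are assumptions made for illustration; the abstract does not specify any of them. The call to an actual language model is left out, and a hand-written string stands in for its output.

```python
import json

# Assumed evaluation dimensions; the abstract does not list the ones the paper uses.
DIMENSIONS = ("coherence", "consistency", "fluency", "relevance")


def build_cot_prompt(source_text: str, summary: str, dimension: str) -> str:
    """Build a reference-free chain-of-thought evaluation prompt (hypothetical wording)."""
    if dimension not in DIMENSIONS:
        raise ValueError(f"unknown dimension: {dimension}")
    return (
        f"You are evaluating the {dimension} of a summary.\n"
        f"Source text:\n{source_text}\n\n"
        f"Summary:\n{summary}\n\n"
        "Think step by step about how well the summary performs on this "
        "dimension, then end with a line of the form 'Score: <1-5>'."
    )


def make_finetune_record(prompt: str, teacher_output: str) -> str:
    """Serialize one prompt and the large model's reasoning-plus-score as a JSON line."""
    record = {"prompt": prompt, "completion": teacher_output}
    return json.dumps(record, ensure_ascii=False)


def parse_score(model_output: str) -> int:
    """Return the integer from the last 'Score: N' line of a model output."""
    for line in reversed(model_output.strip().splitlines()):
        if line.lower().startswith("score:"):
            return int(line.split(":", 1)[1].strip())
    raise ValueError("no score line found")
```

In use, each line produced by `make_finetune_record` would be appended to a JSONL file that serves as the training set for the small model, and `parse_score` would be applied to the fine-tuned model's output at evaluation time.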
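Keywords: text summarization; quality evaluation; large language model; chain of thought; fine-tuning
Classification: TP391.1 [Automation and Computer Technology: Computer Application Technology]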