Author: 李崭 (Li Zhan)
Source: Artificial Intelligence and Robotics Research (《人工智能与机器人研究》), 2025, No. 2, pp. 304-312 (9 pages)
Funding: National Natural Science Foundation of China (Grant No. 62406263).
Abstract: Advances in large language models (LLMs) have fundamentally changed the knowledge distillation paradigm in natural language processing. Prompt-based knowledge acquisition from large models has shifted distillation toward more general knowledge acquisition and data augmentation. Addressing this shift in knowledge extraction and the challenge of few-shot learning for small models, this paper proposes a weakness-enhanced LLM knowledge distillation algorithm. Leveraging the semantic understanding and text generation capabilities of the LLM, the method analyzes the student model's weaknesses under few-shot conditions; the LLM teacher then constructs augmented samples targeting those weaknesses, and the student is iteratively retrained on them to strengthen its capabilities. Experimental results on several natural language processing tasks show that, with only a small amount of labeled data, the proposed distillation method substantially improves training effectiveness, demonstrating its validity.
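Method sketch: the following is a minimal, hypothetical Python sketch of the analyze-augment-retrain loop described in the abstract; all names (Example, student_fit, student_predict, teacher_augment) are illustrative assumptions and not the paper's actual implementation.

from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Example:
    text: str
    label: str

def weakness_enhanced_distillation(
    seed_data: List[Example],
    student_fit: Callable[[List[Example]], None],        # trains/updates the student model in place
    student_predict: Callable[[str], str],               # student inference on raw text
    teacher_augment: Callable[[List[Example], int], List[Example]],  # LLM teacher builds labeled samples targeting weak cases
    rounds: int = 3,
    samples_per_round: int = 200,
) -> List[Example]:
    # Start from the few-shot seed set and iteratively enlarge it.
    train_set = list(seed_data)
    for _ in range(rounds):
        student_fit(train_set)
        # Weakness analysis: examples the current student still gets wrong.
        weak_cases = [ex for ex in train_set if student_predict(ex.text) != ex.label]
        if not weak_cases:
            break
        # Teacher augmentation: ask the LLM to construct new samples aimed at these
        # weaknesses; the student is retrained on the enlarged set in the next round.
        train_set.extend(teacher_augment(weak_cases, samples_per_round))
    return train_set

Given concrete student_fit / student_predict callables and an LLM-backed teacher_augment, this loop realizes the weakness-driven distillation cycle under a small labeled-data budget.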
CLC Number: TP391 [Automation and Computer Technology - Computer Application Technology]