Authors: LIAO Liefa (廖列法) [1,2]; JIANG Xuanzhi (姜炫至) (School of Information Engineering, Jiangxi University of Science and Technology, Nanchang 341000, China)
Affiliations: [1] School of Information Engineering, Jiangxi University of Science and Technology, Nanchang 341000, China; [2] Jiangxi Modern Polytechnic College, Nanchang 330095, China
Source: Computer Engineering and Applications (《计算机工程与应用》), 2025, No. 8, pp. 117-125 (9 pages)
Abstract: Pre-trained models applied to medical text classification suffer from slow processing, high hardware compute requirements, and difficulty distinguishing few-shot categories, while traditional small models cannot reach sufficient accuracy due to their own limitations. To address these problems, a text classification model fusing pre-training and meta distillation, PTMD (fusion of pre-training and meta distillation model), is proposed. Targeting the multi-label nature of medical text, PTMD fine-tunes the RoBERTa pre-trained model with contrastive training, then fully captures semantic information with bidirectional built-in-attention simple recurrent units. Finally, on top of the traditional distillation framework, it incorporates the ideas of meta-learning and teaching assistants, raising the teacher's instructional quality through methods such as teaching experiments and a two-level model, and ultimately obtains a high-performance medical text classification model at reduced training cost. Experimental results show that the teacher model achieves an F1 score of 85.47% on the CHIP2019 evaluation Task 3 dataset, while the student model shrinks to nearly 1/6 of the teacher's size at a loss of only 1.45 percentage points of F1, outperforming most traditional pre-trained models and knowledge distillation models and demonstrating the model's good practical value.
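The abstract describes PTMD's meta-distillation pipeline only at a high level. As background, the sketch below shows the generic teacher-student distillation objective that such pipelines build on, written here for the multi-label setting the paper targets (sigmoid outputs rather than softmax). It assumes PyTorch; the function name, temperature, and weight alpha are illustrative placeholders rather than the paper's settings, and the paper's meta-learning, teaching-assistant, and two-level-model components are not reproduced.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Hypothetical teacher-student loss for multi-label classification.

    Mixes a soft-target term (match the teacher's tempered per-label
    sigmoid probabilities) with a hard-target term (ordinary supervised
    loss on the 0/1 ground-truth labels).
    """
    # Soft targets: the teacher's tempered probabilities for each label.
    soft_teacher = torch.sigmoid(teacher_logits / temperature)
    soft_loss = F.binary_cross_entropy_with_logits(
        student_logits / temperature, soft_teacher)

    # Hard targets: standard multi-label BCE against ground truth.
    hard_loss = F.binary_cross_entropy_with_logits(student_logits, labels)

    # temperature**2 rescales the soft-loss gradients so the two terms
    # stay comparable across temperatures (as in Hinton et al., 2015).
    return alpha * temperature ** 2 * soft_loss + (1 - alpha) * hard_loss
```

A smaller student (for example, a BiLSTM) trained with a loss of this form against a RoBERTa-based teacher is the usual route to the roughly 1/6 size reduction the abstract reports.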
CLC Number: TP391 (Automation and Computer Technology: Computer Application Technology)