Authors: YU Bihui; CAI Xingye [1,2]; WEI Jingxuan
Affiliations: [1] University of Chinese Academy of Sciences, Beijing 100049, China; [2] Shenyang Institute of Computing Technology, Chinese Academy of Sciences, Shenyang, Liaoning 110168, China
Source: Journal of Computer Applications, 2023, No. 9, pp. 2735-2740 (6 pages)
Fund: National Key Research and Development Program of China (2019YFB1405803).
Abstract: Text classification tasks usually rely on sufficient labeled data. To address the over-fitting of classification models on small samples in low-resource scenarios, a few-shot text classification method based on prompt learning, BERT-P-Tuning, was proposed. First, the pre-trained model BERT (Bidirectional Encoder Representations from Transformers) was used to learn the optimal prompt template from labeled samples. Then, the prompt template and a vacancy were appended to each sample, transforming the text classification task into a cloze task. Finally, the label was obtained by predicting the word with the highest probability at the vacant position and applying the mapping between that word and the labels. Experimental results on the short-text classification tasks of the public dataset FewCLUE show that the proposed method significantly improves the evaluation metrics compared with the BERT fine-tuning based method. Specifically, accuracy and F1 score increased by 25.2 and 26.7 percentage points respectively on the binary classification task, and by 6.6 and 8.0 percentage points respectively on the multi-class classification task. Compared with the PET (Pattern Exploiting Training) method, which constructs templates manually, the proposed method improved accuracy by 2.9 and 2.8 percentage points and F1 score by 4.4 and 4.2 percentage points on the two tasks respectively, verifying the effectiveness of applying pre-trained models to few-shot tasks.
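The abstract's pipeline (append a prompt template with a vacancy, predict the word at the vacancy, map the predicted label word to a class) can be sketched as follows. This is a minimal illustrative sketch, not the paper's code: the verbalizer, template, and the toy scoring function standing in for BERT's masked-language-model probabilities are all assumptions for demonstration.

```python
# Sketch of cloze-style prompt classification as described in the abstract.
# The verbalizer, template, and toy scorer are illustrative assumptions;
# a real system would read an MLM's logits at the [MASK] position.

VERBALIZER = {"好": "positive", "差": "negative"}  # label word -> class label

def build_prompt(text: str, template: str = "这条评论很[MASK]。") -> str:
    """Append a prompt template containing a [MASK] vacancy to the input."""
    return text + template

def toy_mask_scores(prompt: str) -> dict:
    """Stand-in for MLM probabilities over label words at the vacancy."""
    if "喜欢" in prompt or "推荐" in prompt:   # crude keyword heuristic
        return {"好": 0.9, "差": 0.1}
    return {"好": 0.2, "差": 0.8}

def classify(text: str) -> str:
    """Classify by picking the highest-probability label word, then mapping it."""
    prompt = build_prompt(text)
    scores = toy_mask_scores(prompt)
    label_word = max(scores, key=scores.get)  # word with highest probability
    return VERBALIZER[label_word]             # map label word -> final class
```

In the actual method, `toy_mask_scores` would be replaced by a forward pass through BERT's masked-language-model head, and P-tuning would learn continuous template embeddings rather than the fixed text template assumed here.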
Keywords: few-shot learning; text classification; pre-trained model; prompt learning; adaptive template
CLC number: TP391.1 [Automation and Computer Technology — Computer Application Technology]