Exploration on Chinese Text Classification Method Model Based on Prompt Learning (original title: 基于提示学习的中文文本分类方法探究)


Authors: CAI Fei (蔡飞) [1], SONG Chengyu (宋城宇), WANG Siyuan (王思远), LI Peihong (李佩宏), LIN Kejing (林可菁)

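Affiliations: [1] Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha 410073, China; [2] College of Information Resources Management, Renmin University of China, Beijing 100872, China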

Source: Fire Control & Command Control (《火力与指挥控制》), 2023, No. 10, pp. 198-203, 211 (7 pages)

Abstract: Automatic text classification is a basic method for processing unstructured information in the current era of informatization and data, and a key technique for making decision-making systems more intelligent. In recent years, owing to its strong performance in few-shot settings and in transfer learning, text classification based on prompt learning has been widely applied to a range of natural language processing tasks. However, existing work on prompt-based methods still concentrates on English. Because English and Chinese differ greatly in semantics and grammar, whether prompt-based classification can improve model performance on Chinese tasks remains to be explored. Therefore, the performance of prompt-based text classification is evaluated experimentally on several classification tasks from the Chinese benchmark dataset CLUE. The results show that prompt-based classification outperforms the baselines on a variety of classification tasks and remains robust under different input lengths and numbers of labels.

Keywords: prompt learning; text classification; pre-trained language model

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
