Hierarchical multi-label text classification model based on contrastive learning and BERT


Authors: DAI Lin-lin; ZHANG Chao-qun [1,2]; TANG Wei-dong [1]; LIU Cheng-xing; ZHANG Long-hao (College of Artificial Intelligence, Guangxi Minzu University, Nanning 530006, China; Guangxi Key Laboratory of Hybrid Computation and IC Design Analysis, Guangxi Minzu University, Nanning 530006, China)

Affiliations: [1] College of Artificial Intelligence, Guangxi Minzu University, Nanning 530006, China; [2] Guangxi Key Laboratory of Hybrid Computation and IC Design Analysis, Guangxi Minzu University, Nanning 530006, China

Source: Computer Engineering and Design (《计算机工程与设计》), 2024, No. 10, pp. 3111-3119 (9 pages)

Funding: National Natural Science Foundation of China (62062011); Natural Science Foundation of Guangxi (2019GXNSFAA185017); Graduate Education Innovation Program of Guangxi Minzu University (gxun-chxs2022094)

Abstract: To effectively address the difficulty existing text classification models have in modeling the semantic relationships among labels, a hierarchical multi-label text classification model combining contrastive learning and a self-attention mechanism, named SampleHCT, is proposed. A label feature extraction module is designed to extract both the semantic and the hierarchical structural features of labels. The self-attention mechanism is used to construct positive samples carrying mixed label information, and contrastive learning is applied to train the label-awareness of the text encoder. Experimental results show that SampleHCT achieves higher classification scores than 19 baseline models, verifying that it models label information more effectively.
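The abstract outlines two core steps: attention-weighted mixing of label embeddings to build a positive sample, and a contrastive objective that trains the text encoder's label-awareness. The paper's exact formulation is not reproduced in this record, so the following is only a minimal NumPy sketch under assumed details: scaled dot-product attention weights, an InfoNCE-style loss, and hypothetical helper names (`mix_positive`, `info_nce`) and embedding sizes.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def mix_positive(text_emb, label_embs):
    """Build a positive sample with mixed label information:
    the text embedding queries the label embeddings, and the
    attention weights produce a weighted mix of label features."""
    scores = label_embs @ text_emb / np.sqrt(text_emb.shape[0])
    weights = softmax(scores)          # one weight per label
    return weights @ label_embs        # attention-mixed positive

def info_nce(anchor, positive, negatives, tau=0.1):
    """InfoNCE-style contrastive loss: pull the text embedding
    toward its mixed-label positive, push it from negatives."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)
    logits = np.array([cos(anchor, positive)]
                      + [cos(anchor, n) for n in negatives]) / tau
    return -np.log(softmax(logits)[0])  # -log p(positive)

rng = np.random.default_rng(0)
text_emb = rng.normal(size=8)          # encoder output for one text
label_embs = rng.normal(size=(4, 8))   # embeddings of its labels
negatives = rng.normal(size=(5, 8))    # embeddings of other texts

pos = mix_positive(text_emb, label_embs)
loss = info_nce(text_emb, pos, list(negatives))
print(pos.shape, float(loss) > 0)
```

In training, minimizing this loss would push the text encoder to produce representations close to the label-informed positives, which is one plausible reading of "training the label-awareness of the text encoder" described above.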

Keywords: text classification; contrastive learning; self-attention mechanism; hierarchical structure; multi-label; label information; global features

Classification: TP391 [Automation and Computer Technology - Computer Application Technology]
