Multi-label Chinese short text classification method based on BERT-GAT-CorNet  (Cited by: 2)

Authors: LIU Xinzhong; ZHAO Aoqing; XIE Wenwu; YANG Zhihe (College of Information Science and Engineering, Hunan Institute of Science and Technology, Yueyang 414000, Hunan, China)

Affiliation: [1] College of Information Science and Engineering, Hunan Institute of Science and Technology, Yueyang 414000, Hunan, China

Source: Journal of Computer Applications, 2023, No. S02, pp. 18-21 (4 pages)

Funding: Natural Science Foundation of Hunan Province (2023JJ50045, 2023JJ50046).

Abstract: Multi-label text classification is an important part of multi-label classification. Traditional multi-label text classification algorithms often focus only on the information of the text itself, fail to capture deep semantic information, and do not consider the relationships between labels. To address these issues, a multi-label text classification model integrating BERT (Bidirectional Encoder Representation from Transformers), GAT (Graph Attention neTwork) and CorNet (Correlation Network) was proposed. First, the pre-trained model BERT was used to represent the feature vectors of the texts, and the generated feature vectors were used to build graph-structured data. Then, GAT was used to assign different weights to different nodes. Finally, Softmax-CorNet was applied to learn label correlations, enhancing prediction and performing classification. The proposed model achieves accuracies of 93.3% and 83.2% on the TNEWS (Toutiao news) and KUAKE-QIC datasets, respectively. Comparative experiments show that the proposed model achieves an effective performance improvement on multi-label text classification tasks.
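The pipeline described in the abstract (BERT features, a graph over texts, GAT weighting, then a Softmax-CorNet output) can be pictured with a short PyTorch sketch. This is a minimal illustration under stated assumptions, not the authors' implementation: the single-head GAT layer, the hidden and bottleneck sizes, the batch-level adjacency matrix adj (the abstract does not specify how the text graph is built), and the checkpoint name bert-base-chinese are all assumed for the example.

# Minimal PyTorch sketch of a BERT-GAT-CorNet classifier. Sizes, the graph rule
# and the CorNet bottleneck are illustrative assumptions, not the paper's settings.
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import BertModel

class GATLayer(nn.Module):
    # Single-head graph attention layer: learns a weight for every edge, then
    # aggregates neighbor features with those weights.
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)
        self.a = nn.Linear(2 * out_dim, 1, bias=False)

    def forward(self, h, adj):
        # h: (N, in_dim) node features; adj: (N, N) 0/1 adjacency with self-loops
        Wh = self.W(h)
        N = Wh.size(0)
        pairs = torch.cat([Wh.unsqueeze(1).expand(N, N, -1),
                           Wh.unsqueeze(0).expand(N, N, -1)], dim=-1)
        e = F.leaky_relu(self.a(pairs).squeeze(-1))      # raw attention scores
        e = e.masked_fill(adj == 0, float('-inf'))       # keep only graph edges
        alpha = torch.softmax(e, dim=-1)                 # per-node edge weights
        return F.elu(alpha @ Wh)

class CorNetBlock(nn.Module):
    # Residual block that refines raw label logits using learned label correlations.
    def __init__(self, num_labels, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(num_labels, bottleneck)
        self.up = nn.Linear(bottleneck, num_labels)

    def forward(self, logits):
        z = torch.sigmoid(logits)          # squash logits before mixing labels
        z = F.elu(self.down(z))
        return logits + self.up(z)         # residual keeps the base predictions

class BertGatCorNet(nn.Module):
    def __init__(self, num_labels, hidden=256, bert_name='bert-base-chinese'):
        super().__init__()
        self.bert = BertModel.from_pretrained(bert_name)
        self.gat = GATLayer(self.bert.config.hidden_size, hidden)
        self.classifier = nn.Linear(hidden, num_labels)
        self.cornet = CorNetBlock(num_labels)

    def forward(self, input_ids, attention_mask, adj):
        # One node per text in the batch; adj encodes text-to-text relations.
        cls = self.bert(input_ids=input_ids,
                        attention_mask=attention_mask).last_hidden_state[:, 0]
        h = self.gat(cls, adj)             # re-weight node features over the graph
        logits = self.classifier(h)
        return self.cornet(logits)         # label-correlation-enhanced logits

In this reading, each short text is one graph node: GAT re-weights the BERT [CLS] vectors over the text graph, and the CorNet residual block refines the raw per-label logits with learned label co-occurrence before the final activation and thresholding.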

Keywords: multi-label text classification; pre-trained model; graph-structured data; label correlation; BERT; Graph Attention Network (GAT); CorNet

CLC number: TP391.1 [Automation and Computer Technology / Computer Application Technology]

 
