Multi-label Text Classification with Enhancing Multi-granularity Information Relations

Cited by: 4

Authors: LI Fang-Fang [1]; SU Pu-Zhen; DUAN Jun-Wen; ZHANG Shi-Chao; MAO Xing-Liang

Affiliations: [1] School of Computer Science and Engineering, Central South University, Changsha 410038, Hunan, China; [2] Institute of Big Data and Internet Innovation, Hunan University of Technology and Business, Changsha 410205, Hunan, China

Source: Journal of Software (软件学报), 2023, No. 12, pp. 5686-5703 (18 pages)

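Funding: National Natural Science Foundation of China (62172449, 61836016, 71790615, 62006251, 62172441); Natural Science Foundation of Hunan Province (2021JJ30870, 2021JJ40783); Natural Science Foundation of Changsha (kq2014134); National Defense Science and Technology Key Laboratory Fund (6142101190302).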

Abstract: Multi-label text classification methods based on deep learning have two main shortcomings: they do not learn text information at multiple granularities, and they make little use of the constraint relations between labels. To address these problems, this study proposes a multi-label text classification method that enhances multi-granularity information relations. First, text and labels are embedded in the same space through joint embedding, and the BERT pre-trained model is used to obtain hidden vector feature representations of both. Then, three multi-granularity information relation enhancing modules are constructed: a document-level information shallow label attention (DISLA) classification module, a word-level information deep label attention (WIDLA) classification module, and a label constraint relation matching auxiliary module. The first two modules perform multi-granularity learning on the shared feature representation: shallow interactive learning between document-level text information and label information, and deep interactive learning between word-level text information and label information. The auxiliary module improves classification performance by learning the relations between labels. Finally, the proposed method is compared with current mainstream multi-label text classification algorithms on three representative datasets. The results show that it achieves the best performance on the main metrics Micro-F1, Macro-F1, nDCG@k, and P@k.
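The ranking metrics named in the abstract, P@k and nDCG@k, can be illustrated with a minimal sketch. It assumes the definitions commonly used in multi-label classification with binary relevance; this record does not give the paper's exact formulas, so the function names and the toy data below are hypothetical and not taken from the paper.

```python
import math


def top_k_indices(scores, k):
    """Return the indices of the k highest-scoring labels, best first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


def precision_at_k(y_true, scores, k):
    """P@k: fraction of the top-k predicted labels that are relevant."""
    top = top_k_indices(scores, k)
    return sum(y_true[i] for i in top) / k


def ndcg_at_k(y_true, scores, k):
    """nDCG@k with binary relevance (assumed definition, see lead-in)."""
    top = top_k_indices(scores, k)
    # Discounted gain of the predicted ranking; rank is 0-based, so the
    # discount for the first position is log2(2) = 1.
    dcg = sum(y_true[i] / math.log2(rank + 2) for rank, i in enumerate(top))
    # Ideal ranking places all relevant labels (up to k of them) first.
    n_relevant = min(k, sum(y_true))
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(n_relevant))
    return dcg / idcg if idcg > 0 else 0.0


# Toy example (hypothetical): five labels, two of them relevant.
y_true = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.3, 0.2, 0.1]

p_at_3 = precision_at_k(y_true, scores, 3)
ndcg_at_3 = ndcg_at_k(y_true, scores, 3)
```

In this toy case the top three predictions contain both relevant labels, so P@3 is 2/3, while nDCG@3 falls below 1 because one relevant label is ranked third rather than second.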

Keywords: attention mechanism; multi-label text classification; label relations; multi-granularity information

Classification code: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]
