Information Fusion Combined with GNN is Used for Inductive Text Classification  (Cited by: 2)


Authors: ZHENG Cheng [1,2]; NI Xian-hu; ZHANG Su-hang; ZHAO Yi-yan

Affiliations: [1] School of Computer Science and Technology, Anhui University, Hefei 230601, China; [2] Key Laboratory of Computational Intelligence and Signal Processing, Ministry of Education, Hefei 230601, China

Published in: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2023, No. 6, pp. 1170-1176 (7 pages)

Abstract: Recently, graph neural networks (GNNs) have achieved good results on some text classification tasks by converting text data into graph data, thereby capturing the inherent topological structure and dependency information between words. However, once text is built into a graph, many graph-based text classification models suffer from insufficient extraction of global contextual semantic information and local feature information. This paper proposes a graph neural network model that fuses global contextual semantic information with local feature information. Each document is represented as a directed, weighted word co-occurrence network, where edge direction captures word ordering and edge weights highlight the degree of mutual influence between words. A gated recurrent unit (GRU), which excels at modeling long-distance word interactions, captures the global contextual semantic information; an attention mechanism then captures the key local feature information; finally, average pooling and max pooling further improve the model's ability to extract key features. This enriches the global semantic information of document nodes and strengthens local feature representations. Experimental results on three classic English datasets show that the model achieves better classification performance than baseline models.
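The document-to-graph step and the pooling readout described in the abstract can be sketched roughly as follows. This is a minimal illustration only, not the authors' implementation: the function names, the window size of 3, and the pure-Python handling of node features are all assumptions made for the example.

```python
from collections import defaultdict

def build_cooccurrence_graph(tokens, window=3):
    """Directed, weighted word co-occurrence graph for one document.

    An edge (u -> v) is added when v appears after u inside a sliding
    window of `window` tokens; the direction preserves word order, and
    the weight counts how often the ordered pair occurs, reflecting
    the degree of mutual influence between words.
    """
    edges = defaultdict(int)
    for i, u in enumerate(tokens):
        for v in tokens[i + 1 : i + window]:  # words following u within the window
            edges[(u, v)] += 1
    return dict(edges)

def readout(node_vecs):
    """Document-level readout: concatenate average-pooled and
    max-pooled node features, combining both pooling types as
    the abstract describes."""
    dims = range(len(node_vecs[0]))
    avg = [sum(v[d] for v in node_vecs) / len(node_vecs) for d in dims]
    mx = [max(v[d] for v in node_vecs) for d in dims]
    return avg + mx
```

For example, `build_cooccurrence_graph("the cat sat on the mat".split())` produces an edge `("the", "cat")` with weight 1 but no reverse edge `("cat", "the")`, which is what makes the graph directed rather than symmetric.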

Keywords: text classification; graph neural networks; gated recurrent unit; attention

Classification Code: TP391 (Automation and Computer Technology: Computer Application Technology)

 
