Text Classification Method Based on Information Fusion of Dual-graph Neural Network  Cited by: 2

Authors: YAN Jia-dan (闫佳丹); JIA Cai-yan (贾彩燕) [1,2] (School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China; Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing 100044, China)

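Affiliations: [1] School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China; [2] Beijing Key Lab of Traffic Data Analysis and Mining (Beijing Jiaotong University), Beijing 100044, China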

Source: Computer Science (《计算机科学》), 2022, No. 8, pp. 230-236 (7 pages)

Funding: Fundamental Research Funds for the Central Universities (2019JBZ110).

Abstract: Graph neural networks have been widely applied to text classification in recent years. Compared with graph convolutional networks, text-level graph neural network models based on message passing (MP-GNN) use less memory and support online testing. However, such models usually build the lexical graph for each text in the corpus from word co-occurrence information alone, so the information they capture lacks diversity. To address this problem, a text classification method based on information fusion of a dual-graph neural network is proposed. While preserving the original word co-occurrence (lexical) graph used in MP-GNN, the method constructs a semantic graph from the cosine similarity between pairs of words and controls the sparsity of that graph through a threshold, which makes more effective use of the multi-faceted semantic information in the text. In addition, two fusion strategies, direct fusion and attention-mechanism fusion, are tested for their ability to fuse the text representations learned on the lexical graph and the semantic graph. Experimental results on 12 datasets commonly used for text classification, including R8 and R52, show that the proposed dual-graph model improves text classification performance compared with three state-of-the-art (SOTA) text-level graph neural network models: TextLevelGNN, TextING and MPAD.
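The two ideas named in the abstract, a thresholded cosine-similarity semantic graph and the fusion of two text representations, can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not say where the word vectors come from, how direct fusion combines the representations, or how the attention weights are computed, so the toy embeddings, the element-wise sum in `direct_fusion`, and the dot-product scoring against a `query` vector in `attention_fusion` are all assumptions made here for illustration, and every function name is hypothetical.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def cosine_similarity(u: Vector, v: Vector) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_semantic_graph(
    embeddings: Dict[str, Vector], threshold: float
) -> List[Tuple[str, str, float]]:
    """Connect each pair of distinct words whose similarity is >= threshold.

    Raising the threshold removes edges, so the graph becomes sparser.
    """
    words = sorted(embeddings)
    edges = []
    for i, w1 in enumerate(words):
        for w2 in words[i + 1:]:
            sim = cosine_similarity(embeddings[w1], embeddings[w2])
            if sim >= threshold:
                edges.append((w1, w2, sim))
    return edges


def direct_fusion(lexical: Vector, semantic: Vector) -> Vector:
    """Assumed form of direct fusion: element-wise sum of the two representations."""
    return [a + b for a, b in zip(lexical, semantic)]


def attention_fusion(lexical: Vector, semantic: Vector, query: Vector) -> Vector:
    """Assumed form of attention fusion: softmax-weighted sum of the two
    representations, each scored by its dot product with `query`."""
    scores = [sum(q * x for q, x in zip(query, rep)) for rep in (lexical, semantic)]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    w_lex, w_sem = exps[0] / total, exps[1] / total
    return [w_lex * a + w_sem * b for a, b in zip(lexical, semantic)]


# Toy word vectors (made up for this example, not taken from the paper).
toy_embeddings = {
    "bank": [1.0, 0.0],
    "finance": [0.9, 0.1],
    "river": [0.0, 1.0],
}
dense_edges = build_semantic_graph(toy_embeddings, threshold=0.1)
sparse_edges = build_semantic_graph(toy_embeddings, threshold=0.5)
```

With these toy vectors, the lower threshold keeps more word pairs than the higher one, which is the sparsity control the abstract describes. In a real model the `query` vector would presumably be learned, and the fused vector would feed a classifier.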

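Keywords: text classification; graph neural network; semantic information; information fusion; attention mechanism; natural language processing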

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
