Authors: YAN Jia-dan (闫佳丹), JIA Cai-yan (贾彩燕) [1,2]
Affiliations: [1] School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China; [2] Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing 100044, China
Source: Computer Science (《计算机科学》), 2022, No. 8, pp. 230-236 (7 pages)
Funding: Fundamental Research Funds for the Central Universities (2019JBZ110).
Abstract: In recent years, graph neural networks have been widely applied to text classification. Compared with graph convolutional networks, text-level graph neural network models based on message passing (MP-GNN) use less memory and support online testing. However, such models usually build a lexical graph for each text in the corpus using only word co-occurrence information, so the information they capture lacks diversity. To address this problem, the paper proposes a text classification method based on information fusion of dual-graph neural networks. While preserving the original word co-occurrence graph, the method constructs a semantic graph from the cosine similarity between pairs of words and controls the sparsity of that graph with a threshold, making more effective use of the multi-faceted semantic information in the text. In addition, two ways of fusing the text representations learned on the lexical graph and the semantic graph are tested: direct fusion and attention-based fusion. Experiments measure accuracy on 12 datasets commonly used for text classification, including R8 and R52. The results show that, compared with three state-of-the-art text-level graph neural network models (TextLevelGNN, TextING, and MPAD), the dual-graph model effectively improves text classification performance.
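Keywords: text classification; graph neural network; semantic information; information fusion; attention mechanism; natural language processing
Classification: TP391 [Automation and Computer Technology: Computer Application Technology]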
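
The semantic-graph step described in the abstract (connect word pairs by cosine similarity and use a threshold to control sparsity) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `cosine` and `build_semantic_graph`, the toy two-dimensional embeddings, and the choice to keep an edge when the similarity is greater than or equal to the threshold are all assumptions made for illustration. The record does not say which word embeddings or which comparison rule the paper uses.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_semantic_graph(words, embeddings, threshold):
    """Return undirected edges {(w_i, w_j): similarity} for one text.

    An edge is kept only when the cosine similarity of the two word
    vectors is at least `threshold` (assumed rule), so a higher
    threshold yields a sparser graph.
    """
    vocab = sorted(set(words))  # one node per distinct word in the text
    edges = {}
    for i, wi in enumerate(vocab):
        for wj in vocab[i + 1:]:
            sim = cosine(embeddings[wi], embeddings[wj])
            if sim >= threshold:
                edges[(wi, wj)] = sim
    return edges


# Hypothetical toy embeddings; a real system would load pretrained vectors.
toy_embeddings = {
    "bank": [1.0, 0.0],
    "money": [0.8, 0.6],
    "river": [0.0, 1.0],
}
doc = ["bank", "money", "river", "bank"]

dense = build_semantic_graph(doc, toy_embeddings, threshold=0.5)
sparse = build_semantic_graph(doc, toy_embeddings, threshold=0.7)
print(sorted(dense))   # edges kept at the lower threshold
print(sorted(sparse))  # fewer edges at the higher threshold
```

With these toy vectors, raising the threshold from 0.5 to 0.7 drops the weaker "money"/"river" edge and keeps only "bank"/"money", which is the sparsity control the abstract refers to.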