Text Classification Method Combining Grammar Rules and Graph Neural Networks  Cited by: 1


Authors: ZHENG Cheng[1,2], XIAO Shuang

Affiliations: [1] School of Computer Science and Technology, Anhui University, Hefei 230601, China; [2] Key Laboratory of Computational Intelligence and Signal Processing, Ministry of Education, Hefei 230601, China

Source: Journal of Chinese Computer Systems, 2024, No. 11, pp. 2594-2601 (8 pages)

Funding: Supported by the Key Research and Development Program of Anhui Province (202004d07020009).

Abstract: Graph neural networks are widely used for text classification and have achieved notable results. However, existing graph-based text classification models extract global context information and local feature information insufficiently. Moreover, when constructing the text graph, existing methods build edges between words only by sliding a window over the original text, so the model cannot capture interactions between distant words. To address these problems, this paper proposes a text classification model that combines grammar rules and graph neural networks. First, when constructing the text graph, the model not only uses a sliding window over the original text to build edges between words but also extracts phrases according to predefined grammar rules, so as to capture long-distance word interactions. Second, a Transformer encoder extracts contextual information to enrich the global semantic information, while a gated graph neural network extracts local feature information from the text to strengthen the representation of local features. Finally, the extracted word features are fused. Experimental results on four benchmark datasets show that the proposed model achieves better classification performance than the baseline models.
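The graph-construction step described in the abstract can be illustrated with a minimal sketch. The abstract does not list the grammar rules themselves, so the phrase extractor is not shown: `phrase_spans` is a hypothetical input holding the token indices of each extracted phrase. Treating each distinct word as one node, the window size of 3, and all function names are likewise assumptions made for illustration, not details taken from the paper.

```python
from itertools import combinations


def sliding_window_edges(tokens, window=3):
    """Connect every pair of distinct words that co-occur inside one window."""
    edges = set()
    # At least one window, even when the text is shorter than the window size.
    for start in range(max(1, len(tokens) - window + 1)):
        span = tokens[start:start + window]
        for a, b in combinations(span, 2):
            if a != b:
                edges.add(frozenset((a, b)))
    return edges


def phrase_edges(tokens, phrase_spans):
    """Connect the words of each phrase, however far apart they are.

    phrase_spans is a hypothetical stand-in for the output of a
    grammar-rule phrase extractor: one tuple of token indices per phrase.
    """
    edges = set()
    for indices in phrase_spans:
        words = [tokens[i] for i in indices]
        for a, b in combinations(words, 2):
            if a != b:
                edges.add(frozenset((a, b)))
    return edges


def build_text_graph(tokens, phrase_spans, window=3):
    """Union of window-based edges and phrase-based edges (undirected)."""
    return sliding_window_edges(tokens, window) | phrase_edges(tokens, phrase_spans)


tokens = ["the", "movie", "that", "we", "watched", "yesterday", "was", "great"]
# Assumed phrase: "movie ... great" (token indices 1 and 7), six words apart.
graph = build_text_graph(tokens, phrase_spans=[(1, 7)], window=3)
print(frozenset(("movie", "great")) in graph)
```

In this example the sliding window alone never links "movie" and "great", because they are farther apart than the window; the phrase edge supplies that long-distance connection.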

Keywords: text classification; graph neural network; text representation; deep learning; natural language processing

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
