A text classification model combining knowledge graph and multiple neural network (融合知识图谱与多神经网络的文本分类模型)


Authors: LI Chao (黎超); LIAO Wei (廖薇)

Affiliation: [1] School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China

Source: Engineering Journal of Wuhan University (《武汉大学学报(工学版)》), 2024, No. 12, pp. 1803-1812 (10 pages)

Funding: National Natural Science Foundation of China (Grant No. 62001282).

Abstract: Existing text classification methods cannot fully extract the semantic features of Chinese text, which degrades classification performance. To address this problem, a text classification model combining a knowledge graph and multiple neural networks, KGMNN (knowledge graph and multiple neural network), is proposed. First, the model uses Word2Vec as the embedding layer to produce vector representations of the text, and uses multiple neural networks to extract its global and local semantic features. Second, an external knowledge graph is used to obtain a set of concepts related to the text in order to enrich the text features, and an attention mechanism is introduced to compute a weight for each concept, which reduces the influence of irrelevant noise concepts on classification. Finally, the text features and their concept features are fused and fed to a Softmax classifier to obtain the classification result. Evaluated on the THUCNews short-text and long-text datasets, the proposed model achieves classification accuracies of 96.67% and 97.57%, respectively, outperforming traditional models.
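As a rough illustration of the concept-attention and fusion steps summarized in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes dot-product attention between the text feature and each concept vector, fusion by concatenation, and a single linear layer before Softmax; the abstract does not specify any of these choices. All function names, vectors, and weights are hypothetical toy values standing in for Word2Vec embeddings and learned neural network features.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]


def softmax(scores: Sequence[float]) -> Vector:
    """Numerically stable softmax over a sequence of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attend_concepts(text_feat: Vector, concepts: List[Vector]) -> Tuple[Vector, Vector]:
    """Weight each concept by its relevance to the text (assumed: dot-product score)."""
    weights = softmax([dot(text_feat, c) for c in concepts])
    dim = len(concepts[0])
    # Attention-weighted sum of the concept vectors.
    pooled = [sum(w * c[i] for w, c in zip(weights, concepts)) for i in range(dim)]
    return weights, pooled


def classify(text_feat: Vector, concepts: List[Vector],
             class_weights: List[Vector], class_bias: Vector) -> Tuple[Vector, Vector]:
    """Fuse text and concept features, then apply a linear layer and Softmax."""
    weights, concept_feat = attend_concepts(text_feat, concepts)
    fused = text_feat + concept_feat  # list concatenation (assumed fusion strategy)
    logits = [dot(row, fused) + b for row, b in zip(class_weights, class_bias)]
    return weights, softmax(logits)


# Toy inputs (hypothetical): one concept aligned with the text, one unrelated "noise" concept.
text_feat = [1.0, 0.0]
concepts = [[2.0, 0.0], [0.0, 2.0]]
class_weights = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
class_bias = [0.0, 0.0]

weights, probs = classify(text_feat, concepts, class_weights, class_bias)
print("concept weights:", weights)
print("class probabilities:", probs)
```

In this toy setup the concept aligned with the text feature receives the larger attention weight, so the unrelated concept contributes less to the fused representation, which is the noise-reduction effect the abstract attributes to the attention mechanism.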

Keywords: neural network; attention mechanism; knowledge graph; Chinese text classification

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
