Deep Document Clustering Model Based on Generalization Graph Convolutional Neural Network


Authors: Chai Bianfang, Li Zheng, Zhao Xiaopeng [2], Wang Rongjuan [3]

Affiliations: [1] College of Information Engineering, Hebei GEO University, Shijiazhuang 050031, Hebei, China; [2] Integrated System Operation and Maintenance Center, Hebei Provincial Department of Finance, Shijiazhuang 050091, Hebei, China; [3] Hebei Vocational College of Geology, Shijiazhuang 050086, Hebei, China

Source: Journal of Nanjing Normal University (Natural Science Edition), 2024, No. 1, pp. 82-90 (9 pages)

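Funding: Science and Technology Research Project of Higher Education Institutions of Hebei Province (ZD2020175); Hebei GEO University 2023 National Pre-research Project (KY202310).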

Abstract: Text classification is an important task in natural language processing. Text classification based on graph neural networks has become a mainstream approach because it can model multiple kinds of interactions among texts. However, most existing graph-based methods rely on labels, and true labels are difficult to obtain. This paper proposes a deep document clustering model based on a generalization graph convolutional neural network (generalization graph convolutional neural network-deep document clustering, GGCN-DDC), which learns text representations and performs unsupervised document classification at the same time. The model first represents each document as a text graph. It then uses generalized convolution layers to learn more discriminative word feature representations and document representations. Finally, parameter learning is constrained by a document clustering loss and a document graph reconstruction loss. Experiments on three benchmark datasets show that GGCN-DDC outperforms the other baseline algorithms on several metrics.
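The pipeline in the abstract (a text graph per document, then a joint clustering and graph reconstruction objective) can be illustrated with a minimal sketch. The abstract does not specify how the text graph is built, what the generalized convolution layer computes, or the exact form of either loss, so everything below is an assumption: a sliding-window word co-occurrence graph, a DEC-style KL clustering loss, and an inner-product reconstruction loss, which are common choices in deep graph clustering but are not confirmed as the authors' method. The generalized convolution layer itself is omitted, and all function names and the `window` and `weight` parameters are hypothetical.

```python
import math


def build_text_graph(tokens, window=2):
    """Build an undirected word co-occurrence graph for ONE document.

    Assumed construction (not taken from the paper): nodes are unique words,
    and two words are linked if they occur within `window` positions.
    """
    vocab = sorted(set(tokens))
    index = {w: i for i, w in enumerate(vocab)}
    n = len(vocab)
    adj = [[0.0] * n for _ in range(n)]
    for i, w in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            a, b = index[w], index[tokens[j]]
            if a != b:
                adj[a][b] = adj[b][a] = 1.0
    return vocab, adj


def soft_assign(z, centers):
    """Student-t soft assignment of a document embedding to cluster centers."""
    sims = [1.0 / (1.0 + sum((zi - ci) ** 2 for zi, ci in zip(z, c)))
            for c in centers]
    total = sum(sims)
    return [s / total for s in sims]


def target_distribution(q_rows):
    """Sharpened targets: p_ij proportional to q_ij^2 / sum_i q_ij."""
    k = len(q_rows[0])
    freq = [sum(q[j] for q in q_rows) for j in range(k)]
    p_rows = []
    for q in q_rows:
        w = [q[j] ** 2 / freq[j] for j in range(k)]
        s = sum(w)
        p_rows.append([x / s for x in w])
    return p_rows


def clustering_loss(q_rows):
    """KL(P || Q) summed over documents (assumed DEC-style loss)."""
    p_rows = target_distribution(q_rows)
    return sum(p * math.log(p / q)
               for pr, qr in zip(p_rows, q_rows)
               for p, q in zip(pr, qr))


def reconstruction_loss(adj, h):
    """Mean binary cross-entropy between adj and sigmoid(h h^T)."""
    n = len(adj)
    eps = 1e-9
    loss = 0.0
    for a in range(n):
        for b in range(n):
            score = sum(x * y for x, y in zip(h[a], h[b]))
            prob = 1.0 / (1.0 + math.exp(-score))
            loss -= (adj[a][b] * math.log(prob + eps)
                     + (1.0 - adj[a][b]) * math.log(1.0 - prob + eps))
    return loss / (n * n)


def total_loss(q_rows, graphs, node_embeddings, weight=1.0):
    """Clustering loss plus weighted mean reconstruction loss over documents."""
    recon = sum(reconstruction_loss(a, h)
                for a, h in zip(graphs, node_embeddings)) / len(graphs)
    return clustering_loss(q_rows) + weight * recon
```

In a trainable version, the node embeddings `h` and the document embeddings `z` would come from the convolution layers, and `total_loss` would be minimized by gradient descent; the pure-Python form here only shows how the two constraints named in the abstract might combine into one objective.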

Keywords: graph neural network; deep graph clustering; text classification; text representation

CLC number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
