Knowledge Graph Embedding Negative Sampling Method Fused with Entity Neighborhood Information (融合实体邻域信息的知识图谱嵌入负采样方法)

Cited by: 1

Authors: ZHAI Sheping (翟社平); ZHANG Yuhang (张宇航); BAI Xiaoxia (柏晓夏)

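Affiliations: [1] School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, Xi'an 710121, China [2] Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing, Xi'an 710121, China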

Source: Computer Engineering (《计算机工程》), 2023, No. 3, pp. 95-104 (10 pages)

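Funding: National Natural Science Foundation of China (61373116); Communication Soft Science Project of the Ministry of Industry and Information Technology (2018R26); Key Research and Development Program of Shaanxi Province (2022GY-038); Shaanxi Provincial College Students' Innovation and Entrepreneurship Training Program (S202111664077); Graduate Innovation Fund of Xi'an University of Posts and Telecommunications (CXJJLY202027).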

Abstract: Knowledge graph embedding (KGE) maps entities and relations into a low-dimensional, continuous vector space. Model training requires both positive and negative triples. Most existing negative sampling methods construct negatives by uniform random sampling, and such negatives contribute little to training. In approaches based on the generative adversarial network (GAN), the generator can sample more plausible negative triples, which improves the embedding model. However, discrete data exhibit vanishing gradients when using genetic algorithms. To address these problems, this paper proposes a KGE negative sampling method fused with entity neighborhood information. Built on the GAN framework, the method uses a graph convolutional neural network to aggregate each entity's neighborhood information along different relation paths, which helps the generator produce high-quality negative samples and improves the performance of the discriminator. In the discriminator, the Wasserstein distance replaces the traditional divergence to solve the vanishing gradient problem and accelerate convergence. The method is evaluated on link prediction and triple classification. In link prediction, MR, MRR, and Hits@10 improve over the baseline models by 4.18, 9.19, and 10.18 percentage points on average, respectively; in triple classification, accuracy improves by 4.50 percentage points on average. These results indicate that incorporating entity neighborhood information further improves negative sample quality and model performance.
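
For orientation, the uniform random negative sampling that the abstract treats as the weak baseline, and the three link prediction metrics it reports, can be sketched as follows. This is a minimal illustration, not the paper's method: the function names `corrupt_uniform` and `rank_metrics`, the toy entities, and the toy ranks are all assumptions introduced here, and the sketch involves no GAN, graph convolution, or Wasserstein distance.

```python
import random

def corrupt_uniform(triple, entities, known, rng):
    """Uniform random negative sampling (hypothetical helper).

    Replaces the head or the tail of a positive triple with an entity
    drawn uniformly at random, and redraws if the result is the original
    triple or another known positive. Assumes at least one valid
    corruption exists; otherwise the loop would not terminate.
    """
    h, r, t = triple
    while True:
        e = rng.choice(entities)
        # Corrupt the head or the tail with equal probability.
        candidate = (e, r, t) if rng.random() < 0.5 else (h, r, e)
        if candidate != triple and candidate not in known:
            return candidate

def rank_metrics(ranks, k=10):
    """Return (MR, MRR, Hits@k) for a list of 1-based ranks."""
    n = len(ranks)
    mr = sum(ranks) / n                          # mean rank, lower is better
    mrr = sum(1.0 / r for r in ranks) / n        # mean reciprocal rank
    hits = sum(1 for r in ranks if r <= k) / n   # share of ranks within top k
    return mr, mrr, hits

# Toy data, purely illustrative.
entities = ["a", "b", "c", "d"]
known = {("a", "likes", "b"), ("b", "likes", "c")}
negative = corrupt_uniform(("a", "likes", "b"), entities, known, random.Random(0))
mr, mrr, hits10 = rank_metrics([1, 2, 10, 50])
```

Because every entity is equally likely to be drawn, most negatives produced this way are implausible triples that an embedding model rejects easily, which is the limitation the abstract attributes to uniform sampling.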

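Keywords: knowledge graph embedding; generative adversarial network; neighborhood information; graph convolutional neural network; Wasserstein distance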

Classification code: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]

 
