Authors: XU Yao (徐遥), HE Shizhu (何世柱), LIU Kang (刘康) [1,2], ZHANG Chi (张弛), JIAO Fei (焦飞) [4], ZHAO Jun (赵军)
Affiliations: [1] Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; [2] University of Chinese Academy of Sciences, Beijing 100049, China; [3] State Grid Tianjin Electric Power Company, Electric Power Research Institute, Tianjin 300384, China; [4] China Electric Power Research Institute Co. Ltd, Beijing 100192, China
Source: Journal of Chinese Information Processing (《中文信息学报》), 2022, No. 10, pp. 54-62 (9 pages)
Funding: Science and Technology Project of State Grid Corporation Headquarters (5700-202012488A-0-0-00).
Abstract: In recent years, embedding models for deterministic knowledge graphs have made great progress on tasks such as knowledge graph completion, but how to design and train embedding models for uncertain knowledge graphs remains an important challenge. Unlike in a deterministic knowledge graph, each fact triple in an uncertain knowledge graph carries a confidence score, so an uncertain knowledge graph embedding model must accurately compute the confidence of every triple. Existing uncertain knowledge graph embedding models have relatively simple structures, can handle only symmetric relations, and do not cope well with the false-negative sample problem. To address these problems, we first propose a unified framework for training uncertain knowledge graph embedding models; the framework trains them with a multi-model semi-supervised learning method. To counter the excessive noise in semi-supervised samples, we also use Monte Carlo Dropout to estimate the model's uncertainty about its outputs and filter noisy semi-supervised samples according to that uncertainty. In addition, to better represent the uncertainty of entities and relations in uncertain knowledge graphs and so handle more complex relations, we propose UBetaE, an uncertain knowledge graph embedding model based on the Beta distribution, which represents both entities and relations as sets of mutually independent Beta distributions. Experimental results on public datasets show that combining the proposed semi-supervised learning method with the UBetaE model not only greatly alleviates the false-negative problem but also significantly outperforms current state-of-the-art uncertain knowledge graph embedding models such as UKGE on multiple tasks.
Keywords: knowledge graph; uncertain knowledge graph embedding; semi-supervised learning; Beta distribution
Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]
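The uncertainty-based filtering step mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the single-layer sigmoid scorer, the function names `mc_dropout_predict` and `filter_pseudo_labels`, the feature vectors, and the threshold value are all hypothetical assumptions chosen only to show the general idea of Monte Carlo Dropout, in which dropout stays active at inference, several stochastic passes are run, and the spread of the predictions is treated as the model's uncertainty.

```python
import math
import random
import statistics


def mc_dropout_predict(features, weights, n_passes=50, drop_prob=0.2, seed=0):
    """Score one candidate triple with dropout left on at inference time.

    Returns (mean confidence, standard deviation) over n_passes stochastic
    passes. The linear-plus-sigmoid scorer is a hypothetical stand-in for a
    real embedding model.
    """
    rng = random.Random(seed)
    keep_prob = 1.0 - drop_prob
    outputs = []
    for _ in range(n_passes):
        # A fresh dropout mask per pass; dividing by keep_prob (inverted
        # dropout) keeps the expected input scale unchanged.
        logit = sum(
            w * x / keep_prob
            for w, x in zip(weights, features)
            if rng.random() < keep_prob
        )
        outputs.append(1.0 / (1.0 + math.exp(-logit)))
    return statistics.fmean(outputs), statistics.pstdev(outputs)


def filter_pseudo_labels(candidates, weights, max_uncertainty=0.1, **kwargs):
    """Keep (id, mean confidence) for candidates with low predictive spread."""
    kept = []
    for triple_id, features in candidates:
        mean_conf, uncertainty = mc_dropout_predict(features, weights, **kwargs)
        if uncertainty <= max_uncertainty:
            kept.append((triple_id, mean_conf))
    return kept


# Hypothetical toy data: "t1" barely depends on which inputs are dropped,
# while the prediction for "t2" swings widely between passes.
weights = [0.8, -0.5, 0.3]
candidates = [("t1", [0.1, 0.0, 0.1]), ("t2", [3.0, 4.0, 0.0])]
kept = filter_pseudo_labels(candidates, weights, max_uncertainty=0.05)
```

In this sketch a candidate is kept as a pseudo-labelled training sample only when its predictions are stable across dropout masks. How the paper sets the threshold and combines multiple models is not stated in the abstract.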