Distantly-supervised Relation Extraction Model via External Knowledge Enhancement  (Cited by: 2)


Authors: ZENG Bi-Qing (曾碧卿); LI Yan-Long (李砚龙); CAI Jian (蔡剑) (School of Software, South China Normal University, Foshan 528225, China)

Affiliation: [1] School of Software, South China Normal University, Foshan 528225, China

Source: Computer Systems & Applications (《计算机系统应用》), 2023, No. 5, pp. 253-261 (9 pages)

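Funding: National Natural Science Foundation of China, General Program (62076103); Guangdong Basic and Applied Basic Research Foundation (2021A1515011171); Guangdong Province Special Project in the Key Field of Artificial Intelligence for Regular Universities (2019KZDZX1033); Guangzhou Basic Research Program, Basic and Applied Basic Research Project (202102080282).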

Abstract: Distantly supervised relation extraction aims to build large-scale labeled corpora efficiently and to apply them to the relation extraction task. However, the way distant supervision constructs the corpus introduces two major problems: noisy labels and a long-tail distribution of relations. This study proposes a novel distantly supervised relation extraction architecture. Unlike previous pipeline-based training schemes, it adds an external knowledge enhancement module alongside the sentence encoder module. By preprocessing and encoding the entity types and relations that already exist in the knowledge base, the module supplies the model with external knowledge that the text of a sentence bag does not contain. This helps alleviate the lack of information caused by the scarcity of instances for some long-tail relations in the dataset, and it improves the model's ability to discriminate noisy instances. In extensive experiments on the benchmark datasets NYT and GDS, the model improves AUC by 0.9% and 5.7%, respectively, over the best mainstream model, which demonstrates the effectiveness of the external knowledge enhancement module.

Keywords: distant supervision; relation extraction; graph convolutional neural network; external knowledge

Classification number: TP391.1 [Automation and Computer Technology: Computer Application Technology]
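
The abstract attributes noisy labels and the long-tail distribution to the way distant supervision builds its corpus. The following is a minimal sketch of that general labeling heuristic, not the authors' implementation: every sentence that mentions both entities of a knowledge-base triple is placed in a bag labeled with that triple's relation. The function name `build_bags`, the toy triples, and the sentences are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def build_bags(kb_triples, sentences):
    """Group sentences into bags keyed by (head, tail, relation).

    A sentence joins a bag whenever it mentions both the head and the tail
    entity of a knowledge-base triple, regardless of whether the sentence
    actually expresses the relation. This is the source of noisy labels.
    """
    bags = defaultdict(list)
    for sentence in sentences:
        for head, relation, tail in kb_triples:
            # Plain substring matching stands in for real entity linking.
            if head in sentence and tail in sentence:
                bags[(head, tail, relation)].append(sentence)
    return dict(bags)


# Hypothetical toy knowledge base and corpus.
kb_triples = [
    ("Barack Obama", "born_in", "Honolulu"),
    ("Steve Jobs", "founder_of", "Apple"),
]
sentences = [
    "Barack Obama was born in Honolulu.",
    "Barack Obama visited Honolulu last week.",  # noisy: does not express born_in
    "Steve Jobs co-founded Apple in 1976.",
]

bags = build_bags(kb_triples, sentences)

# Count labeled instances per relation; on real corpora this count is
# typically very skewed, which is the long-tail problem.
label_counts = Counter()
for (_head, _tail, relation), bag in bags.items():
    label_counts[relation] += len(bag)

for key, bag in bags.items():
    print(key, len(bag))
```

In this toy corpus the second sentence lands in the `born_in` bag even though it does not state a birthplace, which is the kind of noisy instance the abstract refers to. The per-relation counts in `label_counts` are where a long-tail skew would show up on a real dataset.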

 
