Authors: TUO Yuxin; XUE Tao [1] (School of Computer Science, Xi'an Polytechnic University, Xi'an, Shaanxi 710600, China)
Affiliation: [1] School of Computer Science, Xi'an Polytechnic University, Xi'an 710600, China
Source: Journal of Computer Applications, 2023, Issue 7, pp. 2116-2124 (9 pages)
Fund: Shaanxi Provincial Technology Innovation Guidance Program (2020CGXNG-012).
Abstract: To address the complexity of entity overlap and the difficulty of extracting multiple relational triples from natural language text, a joint triple extraction model that combines a pointer network with relation embedding is proposed. First, the input sentence is encoded with the BERT (Bidirectional Encoder Representations from Transformers) pre-trained model. Then, head and tail pointer tagging is used to extract all subjects in the sentence, and a subject- and relation-guided attention mechanism distinguishes the importance of each relation label to each word, so that relation label information is incorporated into the sentence embedding. Finally, for each subject and each relation, the corresponding object is extracted with pointer tagging and a cascade structure, generating the relational triples. Extensive experiments on the New York Times (NYT) and Web Natural Language Generation (WebNLG) datasets show that the proposed model improves overall performance over the current best Cascade Binary Tagging Framework (CasRel) model by 1.9 and 0.7 percentage points, respectively; compared with the span-based Extract-Then-Label (ETL-Span) model, it achieves improvements of more than 6.0% and more than 3.7% in comparison experiments on sentences containing 1 to 5 triples. In particular, on complex sentences containing more than 5 triples, the proposed model improves the F1 score by 8.5 and 1.3 percentage points, respectively, and it maintains stable extraction ability while capturing more entity pairs, which further verifies its effectiveness on the triple overlap problem.
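The abstract outlines a pipeline of BERT encoding, head/tail pointer tagging of subjects, fusion of relation embeddings into the sentence encoding via attention, and per-relation cascade tagging of objects. The PyTorch sketch below illustrates that tagging scheme in a minimal form; it is not the authors' implementation, and the module name, the additive fusion of subject and relation vectors, and the scaled dot-product attention scoring are assumptions made purely for illustration.

```python
# Minimal sketch (not the paper's code) of a cascade pointer-tagging head for
# joint triple extraction: subjects are tagged with head/tail pointers, relation
# embeddings are fused into the token representations via attention, and objects
# are tagged per (subject, relation) pair.
import torch
import torch.nn as nn

class TripleExtractionHead(nn.Module):
    def __init__(self, hidden_size: int, num_relations: int):
        super().__init__()
        # head/tail pointer taggers for subject spans
        self.subj_head = nn.Linear(hidden_size, 1)
        self.subj_tail = nn.Linear(hidden_size, 1)
        # learned embedding for every relation label
        self.rel_embed = nn.Embedding(num_relations, hidden_size)
        # head/tail pointer taggers for object spans (scored per relation)
        self.obj_head = nn.Linear(hidden_size, 1)
        self.obj_tail = nn.Linear(hidden_size, 1)
        self.scale = hidden_size ** 0.5

    def tag_subjects(self, token_repr):
        # token_repr: (batch, seq_len, hidden) from a BERT-style encoder
        head_prob = torch.sigmoid(self.subj_head(token_repr)).squeeze(-1)
        tail_prob = torch.sigmoid(self.subj_tail(token_repr)).squeeze(-1)
        return head_prob, tail_prob            # (batch, seq_len) each

    def tag_objects(self, token_repr, subj_repr):
        # subj_repr: (batch, hidden) pooled representation of one tagged subject
        batch, seq_len, hidden = token_repr.shape
        num_rel = self.rel_embed.num_embeddings
        rel = self.rel_embed.weight             # (num_rel, hidden)
        # relation-guided attention: how important each relation label is
        # to each word (softmax over relation labels per token)
        scores = token_repr @ rel.t() / self.scale      # (batch, seq, num_rel)
        attn = torch.softmax(scores, dim=-1)
        # fuse token, relation, and subject information before object tagging
        fused = (token_repr.unsqueeze(2)                # (batch, seq, 1, hidden)
                 + rel.view(1, 1, num_rel, hidden)
                 + subj_repr.view(batch, 1, 1, hidden))
        fused = fused * attn.unsqueeze(-1)
        # probability that each token starts/ends an object of the given
        # subject under each relation: (batch, seq_len, num_rel)
        obj_head_prob = torch.sigmoid(self.obj_head(fused)).squeeze(-1)
        obj_tail_prob = torch.sigmoid(self.obj_tail(fused)).squeeze(-1)
        return obj_head_prob, obj_tail_prob

# Usage with stand-in tensors (a real setup would feed BERT hidden states):
encoder_out = torch.randn(2, 16, 768)           # (batch, seq_len, hidden)
head = TripleExtractionHead(hidden_size=768, num_relations=24)
subj_h, subj_t = head.tag_subjects(encoder_out)
subj_vec = encoder_out[:, 3, :]                 # pooled repr. of one subject span
obj_h, obj_t = head.tag_objects(encoder_out, subj_vec)
```

Sigmoid pointer scores (rather than a single softmax over tags) let one token start or end spans under several relations at once, which is what allows a sentence to yield overlapping triples in this kind of cascade scheme.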
Keywords: information extraction; overlapping relation; triple extraction; BERT; attention mechanism; deep learning
CLC Number: TP391.1 [Automation and Computer Technology - Computer Application Technology]