Author: Zhang Zhenzeng (章振增) (Linewell Software Co., Ltd., Quanzhou 362000, Fujian, China)
Source: Computer Applications and Software (《计算机应用与软件》), 2023, No. 6, pp. 181-186, 215 (7 pages)
Abstract: Current research on cross-document subject-predicate-object (SPO) triple extraction is mainly based on sentence-level analysis. In many scenarios, however, the elements of an SPO triple may be scattered across different positions in a document, and sentence-level extraction techniques fall far short of the requirements. The paper therefore proposes Doc2SpSPO, a joint SPO extraction model. The model generates initial entity information with a Span candidate-set model, then uses the BERT-WWM pre-trained model to obtain embeddings of the context and of the candidate entities, which feed classification tasks that jointly extract the SPO triples. Experimental results show that the model reaches an F1 score of 44.4% for entity recognition and an accuracy of 66.9% for relation classification.
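Keywords: cross-document triple extraction; BERT; Span rules; joint entity-relation extraction model
Classification: TP391 [Automation and Computer Technology: Computer Application Technology]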
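Note: the abstract says only that a Span candidate-set model produces the initial entity information and gives no further detail. As a rough illustration of what span candidate generation commonly looks like, the sketch below enumerates every contiguous token span up to a fixed width. The function name `enumerate_spans`, the `max_width` parameter, and the sample tokens are all assumptions made for this example; they are not taken from the paper, and the paper's Span rules may differ.

```python
from typing import List, Tuple


def enumerate_spans(tokens: List[str], max_width: int = 3) -> List[Tuple[int, int, str]]:
    """Return every contiguous token span of at most max_width tokens.

    Each candidate is (start, end_exclusive, text). Hypothetical helper,
    not the paper's actual candidate-set model.
    """
    spans = []
    for start in range(len(tokens)):
        # The exclusive end index is capped by both max_width and the sequence length.
        last_end = min(start + max_width, len(tokens))
        for end in range(start + 1, last_end + 1):
            spans.append((start, end, " ".join(tokens[start:end])))
    return spans


# Illustrative tokens only; a real pipeline would use a tokenized document.
candidates = enumerate_spans(["Linewell", "Software", "is", "in", "Quanzhou"], max_width=2)
```

In a span-based pipeline of this kind, each candidate would then be scored by a classifier over contextual embeddings (BERT-WWM in the paper) to decide whether it is an entity and how pairs of entities are related.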