Authors: LIAO Tao (廖涛)[1]; SUN Haojie (孙皓洁); ZHANG Shunxiang (张顺香)[1] (School of Computer Science and Engineering, Anhui University of Science and Technology, Huainan 232001, Anhui, China)
Affiliation: [1] School of Computer Science and Engineering, Anhui University of Science and Technology, Huainan 232001, Anhui, China
Source: Computer Engineering (《计算机工程》), 2023, No. 6, pp. 107-114 (8 pages)
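Funding: National Natural Science Foundation of China, General Program (62076006); Anhui Provincial University Collaborative Innovation Project (GXXT-2021-008); Anhui Provincial Natural Science Foundation, General Program (1908085MF189).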
Abstract: Joint entity-relation extraction models play an important role in entity-relation extraction, but existing joint models cannot effectively identify entity-relation triples that involve overlapping relations. This paper proposes SFFM, a new joint entity-relation extraction model based on spans and feature fusion. The text is fed into the BERT pre-trained model and converted into word vectors, which are divided by span to form span sequences. A Convolutional Neural Network (CNN) filters out the span sequences that contain no entities, a Bi-directional Long Short-Term Memory (Bi-LSTM) network extracts features from the remaining span sequences after they are fused with text information, and Softmax regression performs entity recognition. Dividing by span maps the entities and relations in the text to different span sequences. When an entity in an overlapping relation is related to a distant entity, the division places entity pairs that may be related into the same span sequence, which makes better use of the overlapping relations in the text. On this basis, an attention mechanism captures the dependencies within each span sequence, and Softmax regression classifies the relations in the span sequence. Experimental results show that, compared with the baseline model, the model improves the micro-average and macro-average on the CoNLL04 dataset by 1.87 and 1.73 percentage points, respectively, and improves the micro-average on the SciERC dataset by 5.95 percentage points.
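The span-division step in the abstract can be illustrated with a minimal sketch. It assumes the common span-based formulation in which every contiguous run of up to `max_width` tokens is a candidate span. The function name `enumerate_spans`, the `max_width` parameter, and the sample sentence are hypothetical and are not taken from the paper, and the sketch operates on plain tokens rather than BERT word vectors.

```python
from typing import List, Tuple

Span = Tuple[int, int, List[str]]  # (start, end_exclusive, tokens in the span)


def enumerate_spans(tokens: List[str], max_width: int) -> List[Span]:
    """Return every contiguous span of at most max_width tokens.

    Assumption: this mirrors the usual span-based setup; the paper's exact
    division rule may differ.
    """
    spans: List[Span] = []
    for start in range(len(tokens)):
        # The last allowed end index is limited by both the width and the sentence length.
        last_end = min(start + max_width, len(tokens))
        for end in range(start + 1, last_end + 1):
            spans.append((start, end, tokens[start:end]))
    return spans


# Hypothetical example sentence.
tokens = ["John", "Smith", "lives", "in", "Paris"]
spans = enumerate_spans(tokens, max_width=2)
for start, end, words in spans:
    print(start, end, " ".join(words))
```

In a pipeline like the one described, a filter (a CNN in the abstract) would then discard candidate spans that contain no entity before features are extracted from the rest.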
Keywords: joint extraction; entity-relation extraction; neural network; span; feature fusion
Classification number: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]