Authors: ZHANG Si-miao; ZHU Ji-zhao; LIU Hao; FAN Chun-long
Affiliations: [1] School of Computer Science, Shenyang Aerospace University, Shenyang 110136, China; [2] School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China; [3] Wuhan Digital Engineering Research Institute, Wuhan 430074, China
Source: Journal of China Academy of Electronics and Information Technology, 2024, No. 1, pp. 84-90 (7 pages)
Funding: National Natural Science Foundation of China (Grant No. 62076249).
Abstract: Extracting entity-relation triples from unstructured text is one of the principal tasks in natural language processing. Mainstream approaches adopt joint extraction, which automatically captures the dependency knowledge between entities and relations during training and thereby improves extraction performance. However, these methods ignore entity type knowledge, leading to substantial redundant computation and erroneous results. To address this, this paper proposes a joint entity-relation extraction method that integrates a self-attention mechanism with entity type knowledge. First, the pre-trained model BERT is used as the encoder to obtain a vector representation of each character in the sentence, which is then passed through a bidirectional LSTM layer to produce the final semantic representation. Second, head and tail entities are identified from the output of the representation layer. Next, the semantic information of each head entity is fused into the sentence representation to discover latent semantic relations under the constraint of the head entity's type. Finally, the head entity and the relation are fed into a self-attention module to identify the corresponding tail entity, yielding the entity-relation triple. Extensive experiments on the public datasets NYT and WebNLG show that the proposed model achieves F1 scores of 93.2% and 93.3% on the joint entity-relation extraction task, a significant improvement over current mainstream models.
Keywords: self-attention mechanism; BERT; entity-relation triple; joint extraction
Classification: TN99 [Electronics and Telecommunications—Signal and Information Processing]; TP391 [Electronics and Telecommunications—Information and Communication Engineering]
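The self-attention module named in the keywords and abstract can be illustrated with a minimal NumPy sketch of scaled dot-product self-attention. This is a generic illustration of the mechanism only; all dimensions, variable names, and weight matrices below are assumptions for demonstration, not the paper's actual implementation or hyperparameters.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the row max before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence of token vectors.

    X : (seq_len, d_model) token representations (e.g. encoder outputs).
    Wq, Wk, Wv : (d_model, d_k) learned projection matrices (random here).
    Returns the attended output (seq_len, d_k) and the attention weights.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # (seq_len, seq_len) similarity scores
    weights = softmax(scores, axis=-1)        # each row is a distribution over tokens
    return weights @ V, weights

# Toy example with made-up sizes: 5 tokens, model width 8, attention width 4.
rng = np.random.default_rng(0)
seq_len, d_model, d_k = 5, 8, 4
X = rng.standard_normal((seq_len, d_model))
Wq, Wk, Wv = (rng.standard_normal((d_model, d_k)) for _ in range(3))
out, w = self_attention(X, Wq, Wk, Wv)
```

In the paper's pipeline, an analogous module would take the head-entity and relation representations as input and score candidate tail-entity positions; here the weights are random, so the output is only structurally meaningful.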