Authors: YAN Xinyue (颜新月), YANG Shuqun (杨淑群), GAO Yongbin (高永彬)
Affiliation: School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
Source: Journal of Computer Applications (《计算机应用》), 2024, No. 11, pp. 3379-3385 (7 pages)
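Funding: Shanghai Local Capacity Building Project (21010501500); Shanghai "Science and Technology Innovation Action Plan" Social Development Science and Technology Research Project (21DZ1204900).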
Abstract: Document-level Relation Extraction (DocRE) aims to identify all relations that hold between entity pairs in a document. To address the ineffective use of evidence sentences and document information, as well as the problem of entities with multiple mentions, a multi-feature fusion DocRE model named EMF (Evidence Multi-feature Fusion) was constructed on the basis of evidence-enhanced contextual features. Firstly, entity types were added before and after entities, and relation text features were associated with entity mentions to obtain relation-specific entity features. Secondly, segment representations were obtained through different convolution kernels, and entity-pair-aware multi-granularity segment-level features were obtained through an attention mechanism; meanwhile, the evidence distribution was used to enhance contextual features highly correlated with the entity pair. Finally, the above features were fused for relation classification, and during inference the obtained evidence was composed into a pseudo-document and input into the classifier together with the original document. Experimental results on the DocRE dataset DocRED (Document-level Relation Extraction Dataset) show that, with BERTbase as the pre-trained language model (PLM) encoder, EMF improves Ign F1 and F1 by 0.42 and 0.41 percentage points respectively over the state-of-the-art model EIDER (EvIDence-Enhanced DocRE), with F1 reaching 62.89%. The EMF model pays more attention to the parts related to entities and relations, which can improve extraction accuracy and gives good interpretability.
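Two of the preprocessing steps named in the abstract (wrapping entities with their types, and composing evidence sentences into a pseudo-document for inference) can be illustrated with a minimal sketch. This is not the authors' implementation: the marker format `[TYPE]` / `[/TYPE]`, the function names, and the token-list input format are all assumptions made for illustration, and mentions are assumed not to overlap.

```python
from typing import List, Tuple


def mark_entities(tokens: List[str],
                  mentions: List[Tuple[int, int, str]]) -> List[str]:
    """Wrap each mention span [start, end) with its entity type on both sides.

    The "[TYPE]" / "[/TYPE]" marker strings are hypothetical; the abstract
    only says that entity types are added before and after entities.
    """
    out = list(tokens)
    # Process mentions from right to left so earlier indices stay valid.
    for start, end, etype in sorted(mentions, reverse=True):
        out[end:end] = [f"[/{etype}]"]
        out[start:start] = [f"[{etype}]"]
    return out


def build_pseudo_document(sentences: List[List[str]],
                          evidence_ids: List[int]) -> List[str]:
    """Concatenate the predicted evidence sentences in document order."""
    pseudo: List[str] = []
    for i in sorted(set(evidence_ids)):
        pseudo.extend(sentences[i])
    return pseudo


if __name__ == "__main__":
    tokens = ["Marie", "Curie", "was", "born", "in", "Warsaw", "."]
    print(mark_entities(tokens, [(0, 2, "PER"), (5, 6, "LOC")]))
    print(build_pseudo_document([["A"], ["B"], ["C"]], [2, 0]))
```

In a full pipeline, the marked tokens would be fed to the encoder, and the pseudo-document would be classified alongside the original document; how the two predictions are combined is not specified in the abstract and is left out here.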
Classification No.: TP309.2 [Automation and Computer Technology: Computer System Architecture]