Authors: 张万里 (ZHANG Wanli) [1]; 佟安 (TONG An); 李文桥 (LI Wenqiao)
Affiliations: [1] Unit 93209 of PLA (中国人民解放军93209部队), Beijing 100085, China; [2] Computer School, Beijing Information Science and Technology University (北京信息科技大学计算机学院), Beijing 100101, China
Source: Modern Electronics Technique (《现代电子技术》), 2024, No. 21, pp. 91-96 (6 pages)
Abstract: Relation extraction, a core task of information extraction, extracts the relations between entity pairs from unstructured text. Distant supervision reduces the cost and burden of manual annotation by constructing training data automatically, but the original corpus is itself imbalanced, which leads to a long-tailed distribution problem. To address this, a distantly supervised long-tailed relation extraction method based on a constraint graph is proposed, following the idea of multiple-instance learning. First, a constraint graph is constructed from the ontology structure of the knowledge graph and encoded with a graph convolutional network (GCN). Second, sentences are encoded with a piecewise (segmented) dilated convolutional neural network and an entity attention mechanism. Finally, the two encodings are combined for classification. On the public dataset NYT10, the proposed method improves Hits@10, Hits@15 and Hits@20 by approximately 0.6%, 1.5% and 2.6%, respectively, over the best mainstream models, which demonstrates the importance of the constraint information between entity types and relations for distantly supervised long-tailed relation extraction.
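The abstract reports results as Hits@K without defining the metric. The sketch below shows one common way Hits@K is computed in long-tailed relation extraction: for each test example, check whether the gold relation is among the K highest-scoring relations, then macro-average the hit rate over relations. The function name `hits_at_k`, the input layout, and the macro-averaging choice are assumptions made for illustration; the paper's exact evaluation protocol is not stated in this record.

```python
from collections import defaultdict


def hits_at_k(scores, gold, k):
    """Macro-averaged Hits@K (illustrative definition, not the paper's code).

    scores: list of dicts, one per test example, mapping relation -> score
    gold:   list of gold relation labels, aligned with `scores`
    k:      number of top-ranked relations to inspect
    """
    per_relation = defaultdict(list)
    for example_scores, gold_label in zip(scores, gold):
        # Rank relations by descending score and keep the top k.
        top_k = sorted(example_scores, key=example_scores.get, reverse=True)[:k]
        per_relation[gold_label].append(1.0 if gold_label in top_k else 0.0)
    if not per_relation:
        return 0.0
    # Average within each gold relation first, then across relations,
    # so that frequent relations do not dominate rare ones.
    return sum(sum(h) / len(h) for h in per_relation.values()) / len(per_relation)


# Toy data with hypothetical relation names "a", "b", "c".
toy_scores = [
    {"a": 0.9, "b": 0.1, "c": 0.0},
    {"a": 0.5, "b": 0.3, "c": 0.2},
    {"a": 0.6, "b": 0.1, "c": 0.3},
]
toy_gold = ["a", "c", "c"]
print(hits_at_k(toy_scores, toy_gold, 2))
```

In a long-tail evaluation, the examples passed in would typically be restricted to those whose gold relation has few training instances; that filtering step is left out here.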
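Keywords: relation extraction; distant supervision; long-tailed distribution; constraint graph; deep learning; knowledge graph; attention mechanism; dilated convolution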
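Classification: TN911-34 [Electronics and Telecommunications: Communication and Information Systems]; TP391.1 [Electronics and Telecommunications: Information and Communication Engineering]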