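Authors: YANG Sui-Zhu (杨穗珠), LIU Yan-Xia (刘艳霞)[1], ZHANG Kai-Wen (张凯文), HONG Yin (洪吟), HUANG Han (黄翰)[1] (School of Software Engineering, South China University of Technology, Guangzhou 510641)
Source: Chinese Journal of Computers (《计算机学报》), 2021, No. 8, pp. 1636-1660 (25 pages)
Funding: Supported by the National Natural Science Foundation of China (61876208); the Guangdong Province Key R&D Program (2018B010109003); and the Guangzhou Science and Technology Program (201802010007, 201804010276)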
Abstract: Distant supervision can automatically construct datasets for relation extraction, easing the effort and cost of building datasets by hand and laying a foundation for automatic relation extraction. However, datasets built with distant supervision suffer from wrong labelling and the long-tail problem, both of which seriously degrade relation extraction performance. Current research on distantly supervised relation extraction focuses mainly on denoising methods for relation models and on the handling of long-tail relations. In recent years, with the development of deep learning, research in these two areas has met a new round of opportunities and challenges. This paper surveys recent progress in distantly supervised relation extraction and defines a common workflow for the deep-learning-based task, comprising sample denoising, external information fusion, the encoder, and the classifier. Following these modules, it classifies and organizes existing work, analyzes and compares the main methods, identifies the key problems, introduces existing solutions and related datasets, summarizes the evaluation metrics and evaluation protocols used for the task, and looks ahead to future research trends.

Relation extraction is a fundamental task in natural language processing and an essential part of information extraction, but its datasets are costly to build because they require manual labelling. Distant supervision was proposed to alleviate the effort and cost of manually annotating corpora: it can automatically build datasets for the relation extraction task. Owing to its value for automatic relation extraction, it has attracted wide attention from academia and industry in recent years. However, datasets constructed by distant supervision are not equivalent to those produced manually. They suffer from wrong labelling and a long-tail distribution, which lowers their quality and hinders the improvement of relation extraction models trained on them. To reduce this impact, most existing work on distantly supervised relation extraction (DSRE) has focused on how to deal with the noise generated by wrong labelling and with the long-tail distribution. In recent years, deep learning technologies such as deep neural networks, attention mechanisms, and deep reinforcement learning have developed rapidly. Compared with traditional machine learning methods, e.g. feature-based methods, deep learning methods have clear advantages in relation extraction, including the DSRE task. DSRE therefore faces a new round of opportunities and challenges. Moreover, as research has continued, a common workflow for this task has gradually emerged. This paper summarizes the existing work in the field of DSRE, with particular attention to methods based on deep learning. It starts with an introduction to distant supervision and its vanilla assumption, analyzes the major shortcoming, and reviews methods based on traditional machine learning, such as topic models and pattern correlation. It then introduces the general workflow with four modules: sample collection, external information, encoder, and classifier. According t
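Keywords: relation extraction; information extraction; distant supervision; denoising; long-tail phenomenon; wrong labelling
Classification number: TP18 [Automation and Computer Technology - Control Theory and Control Engineering]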