A survey on event extraction in new domains (面向新领域的事件抽取研究综述)  Cited by: 7


Authors: HUANG Heyan (黄河燕)[1,2,3]; LIU Xiao (刘啸)

Affiliations: [1] School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China; [2] Beijing Engineering Research Center of High-Volume Language Information Processing and Cloud Computing Applications, Beijing 100081, China; [3] Southeast Academy of Information Technology, Beijing Institute of Technology, Putian 351100, Fujian, China

Source: CAAI Transactions on Intelligent Systems (《智能系统学报》), 2022, No. 1, pp. 201-212 (12 pages)

Funding: National Natural Science Foundation of China (U19B2020).

Abstract: In the current Internet era, large volumes of unstructured text in new domains contain a wealth of information. Research on event extraction in new domains makes it possible to build domain knowledge bases quickly, supporting downstream knowledge-based applications. However, existing event extraction systems are strongly domain-specific. Building such a system from scratch in a new domain depends heavily on the quality and scale of the event schemas and annotated data, and requires substantial human effort and expert knowledge to customize schemas and annotate corpora. Moreover, multiple associated event instances commonly appear in the same context in datasets, which greatly hinders event extraction and factuality prediction. This paper surveys the emerging research field of event extraction in new domains, reviews the state of research along three directions (event schema induction, collective event extraction, and event factuality prediction), discusses the key open problems and challenges, and points out the research work to be carried out next.

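Keywords: event extraction; new domains; information extraction; event schema induction; collective extraction; event factuality prediction; natural language processing; knowledge base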

Classification number: TP391.4 [Automation and Computer Technology: Computer Application Technology]

 
