Biomedical Event Trigger Detection Based on Two-Stage Question Answering Paradigm


Authors: XING Shuai, XIONG Yujie, SU Qianmin [1], HUANG Jihan [2]

Affiliations: [1] School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China; [2] Center for Drug Clinical Research, Shanghai University of Traditional Chinese Medicine, Shanghai 201203, China

Source: Computer Engineering and Applications, 2024, No. 10, pp. 121-131 (11 pages)

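Funding: Shanghai Science and Technology Innovation Action Plan, Technical Standards Project (21DZ2203100); National Natural Science Foundation of China (62006150).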

Abstract: Existing biomedical event trigger detection methods have three shortcomings: they retain redundant information unrelated to triggers; they ignore potential correlations between entities and events; and traditional methods are vulnerable to data scarcity. To address these problems, a biomedical event trigger detection method based on a two-stage question answering paradigm is proposed. In the event type identification stage, attention based on syntactic distance is used to capture more meaningful contextual features and to exclude the interference of irrelevant information; to make effective use of the latent features in entities, word-entity-event co-occurrence features computed from global statistics guide an event-type-aware attention in mining the strong association between words and events. In the trigger localization stage, a question is formulated for each identified event type, and its answer is the index of the corresponding trigger in the sentence, which allows rich question answering datasets to be leveraged for data augmentation. Results on the MLEE corpus show that the two-stage question answering paradigm, syntactic distance attention, and event-type-aware attention each improve model performance. The proposed model achieves an F1-score of 81.39% and outperforms the other baseline models in the detailed results for multiple event types.
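The abstract does not say how syntactic distance is computed or how it enters the attention, so the following is only an illustrative sketch under stated assumptions: distance is taken to be the shortest-path length between two tokens in the dependency tree, and attention is restricted by a hard cutoff (`max_dist`) before the softmax. The function names, the cutoff, and the toy parse are all hypothetical and are not taken from the paper.

```python
import math
from collections import deque


def syntactic_distances(heads, source):
    """Shortest-path distance in the undirected dependency tree from `source`.

    heads[i] is the index of token i's head, or -1 for the root.
    (Assumed definition of "syntactic distance"; the paper may differ.)
    """
    n = len(heads)
    neighbors = [[] for _ in range(n)]
    for child, head in enumerate(heads):
        if head >= 0:
            neighbors[child].append(head)
            neighbors[head].append(child)
    dist = [None] * n
    dist[source] = 0
    queue = deque([source])
    while queue:  # breadth-first search over the tree
        node = queue.popleft()
        for nxt in neighbors[node]:
            if dist[nxt] is None:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def distance_masked_attention(scores, dist, max_dist=2):
    """Softmax over raw scores; tokens farther than max_dist get zero weight.

    The hard cutoff is an assumption made for illustration only.
    """
    masked = [
        s if d is not None and d <= max_dist else float("-inf")
        for s, d in zip(scores, dist)
    ]
    peak = max(masked)  # finite: the source token always has distance 0
    exps = [math.exp(s - peak) for s in masked]
    total = sum(exps)
    return [e / total for e in exps]


# Toy parse (hypothetical) of: VEGF induces angiogenesis in tumor tissue
tokens = ["VEGF", "induces", "angiogenesis", "in", "tumor", "tissue"]
heads = [1, -1, 1, 5, 5, 2]
dist = syntactic_distances(heads, source=1)  # distances from "induces"
weights = distance_masked_attention([0.0] * len(tokens), dist, max_dist=1)
```

With uniform raw scores and `max_dist=1`, only "VEGF", "induces", and "angiogenesis" receive attention weight, which is one simple way a candidate trigger could be made to ignore syntactically distant context.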

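Keywords: biomedical event; trigger detection; syntactic distance; word-entity-event co-occurrence feature; two-stage question answering paradigm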

Classification No.: TP391.1 [Automation and Computer Technology: Computer Application Technology]

 
