Authors: CHENG Yong; MAO Yingchi; WAN Xu [1,2]; WANG Longbao; ZHU Min
Affiliations: [1] Key Laboratory of Water Big Data Technology of Ministry of Water Resources, Nanjing 210098, China; [2] College of Computer and Information, Hohai University, Nanjing 211100, China
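Source: Computer Science (《计算机科学》), 2023, No. 1, pp. 276-284 (9 pages)
Funding: Key Research and Development Program of Jiangsu Province (BE2020729); Key Technology Project of China Huaneng Group (HNKJ19-H12, HNK20-H64).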
Abstract: Event extraction is an essential task in natural language processing, and event detection is one of its critical steps; its goal is to detect the occurrence of events and classify them. Current Chinese event detection methods based on trigger recognition suffer from polysemous words and from mismatches between words and triggers, both of which reduce the accuracy of event detection models. To address this, we propose Event Detection Without Triggers based on Dual Attention (EDWTDA), a model that skips trigger word recognition and determines event types directly, without trigger word annotations. First, EDWTDA uses ALBERT to improve the semantic representation ability of word embedding vectors, which alleviates the polysemy problem and improves the model's predictive ability. Second, local attention is fused with event types to capture key semantic information in the sentence and to simulate hidden event triggers, which addresses the mismatch between words and triggers. Third, global attention is used to mine contextual information in the document, which further addresses the polysemy problem. Finally, event detection is converted into a binary classification task to handle the multi-label problem, and the focal loss function is used to address the sample imbalance that this conversion produces. Experimental results on the ACE2005 Chinese corpus show that, compared with the best baseline model JMCEE, the proposed model improves precision, recall, and F1-score by 3.40%, 3.90%, and 3.67%, respectively.
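Keywords: dual attention; without triggers; Chinese event detection; ACE2005; binary classification
Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]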
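
The abstract describes two steps that are easier to follow with a small example: converting multi-label event detection into binary classification, and applying focal loss to the resulting imbalanced samples. The following is a minimal sketch of those two ideas only, not the authors' implementation. The event type list, the function names `to_binary_pairs` and `binary_focal_loss`, and the values `gamma=2.0` and `alpha=0.25` (the commonly cited defaults from the original focal loss paper) are all assumptions for illustration; the abstract does not state which settings the paper uses.

```python
import math

# Hypothetical event-type inventory, for illustration only.
EVENT_TYPES = ["Attack", "Die", "Transport", "Meet"]


def to_binary_pairs(sentence, gold_types, event_types=EVENT_TYPES):
    """Turn one multi-label sentence into binary instances.

    Each candidate event type yields one (sentence, type, label) triple,
    where label is 1 if the sentence expresses that type and 0 otherwise.
    Most triples are negative, which is the source of the imbalance.
    """
    return [(sentence, t, 1 if t in gold_types else 0) for t in event_types]


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal loss for a single prediction.

    p is the predicted probability of the positive class, y is 0 or 1.
    The factor (1 - p_t) ** gamma down-weights easy, well-classified
    examples; alpha balances positive against negative examples.
    """
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    pairs = to_binary_pairs("example sentence", {"Attack", "Die"})
    for _, event_type, label in pairs:
        print(event_type, label)
    # A confident correct prediction contributes far less than an uncertain one.
    print(binary_focal_loss(0.95, 1), binary_focal_loss(0.30, 1))
```

With `gamma=0` and `alpha=0.5` the function reduces to half of the ordinary binary cross-entropy, which makes it easy to compare the two losses on the same predictions.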