Chinese Event Extraction Method Based on the ABBSAC Model


Authors: CHEN Quanlin; JIA Jun; FAN Shuo (Institute of War, Academy of Military Sciences, Beijing 100091, China)

Affiliation: Institute of War, Academy of Military Sciences, Beijing 100091, China

Source: Microelectronics & Computer, 2024, No. 5, pp. 57-66 (10 pages)

Funding: National Natural Science Foundation of China (62003366); Military Graduate Research Funding Project of the PLA (JYKT911012022005).

Abstract: Event extraction, an important part of information extraction, is the main way to transform unstructured text into valuable structured text. To address the long training times and large model sizes common to current event extraction models, this paper proposes a Chinese event extraction model based on ABBSAC. The model reduces its size with the ALBERT pre-trained model, captures internal associations within the text with BiSRU++, incorporates an attention mechanism to improve accuracy, and takes the output of a CRF layer as the extraction result. A corpus was constructed independently from Sina News and used for comparative experiments. The model achieves high precision, recall, and F1-score while training about 10% faster and using about 82% fewer parameters, demonstrating the superiority of the proposed model. On the ACE05 and DuEE benchmark datasets, compared with state-of-the-art methods, it improves the F1-score of trigger extraction by 1.7% and 0.3%, respectively, and the F1-score of argument role extraction by 5.4% and 0.1%, respectively, effectively improving the performance of Chinese event extraction.
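The abstract's final stage, taking the CRF output as the extraction result, amounts to decoding the most likely BIO tag sequence over the encoder's per-token scores. The sketch below is a minimal, self-contained Viterbi decoder under assumed illustrative inputs (the tag set, emission scores, and transition scores are hypothetical, not taken from the paper):

```python
# Minimal Viterbi decoding sketch for a linear-chain CRF output layer,
# as used for the final tag-decoding step described in the abstract.
# In the actual model, emissions would come from the ALBERT + BiSRU++ +
# attention encoder; here they are hypothetical scores for illustration.

TAGS = ["O", "B-TRIG", "I-TRIG"]  # illustrative BIO tags for trigger extraction

def viterbi_decode(emissions, transitions, tags=TAGS):
    """emissions: per-token lists of tag scores, shape (seq_len, n_tags).
    transitions[i][j]: score of moving from tag i to tag j.
    Returns the highest-scoring tag sequence."""
    n = len(tags)
    score = list(emissions[0])          # best score ending in each tag so far
    backptr = []                        # backpointers for path recovery
    for em in emissions[1:]:
        ptrs, new_score = [], []
        for j in range(n):
            # best previous tag for current tag j
            best_i = max(range(n), key=lambda i: score[i] + transitions[i][j])
            ptrs.append(best_i)
            new_score.append(score[best_i] + transitions[best_i][j] + em[j])
        backptr.append(ptrs)
        score = new_score
    # backtrack from the best final tag
    best = [max(range(n), key=lambda j: score[j])]
    for ptrs in reversed(backptr):
        best.append(ptrs[best[-1]])
    return [tags[i] for i in reversed(best)]

# Example: emissions strongly favoring the sequence B-TRIG, I-TRIG, O
emissions = [[0.0, 5.0, 0.0],
             [0.0, 0.0, 5.0],
             [5.0, 0.0, 0.0]]
transitions = [[0.0] * 3 for _ in range(3)]  # uniform transitions for the sketch
print(viterbi_decode(emissions, transitions))  # ['B-TRIG', 'I-TRIG', 'O']
```

In a full implementation the transition matrix is learned jointly with the encoder, which lets the CRF forbid invalid tag sequences such as `I-TRIG` directly after `O`.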

Keywords: Chinese event extraction; ALBERT; Bi-SRU++; attention mechanism; trigger extraction; argument role extraction

Classification: TP391.1 [Automation and Computer Technology - Computer Application Technology]
