General Fine-grained Event Detection Based on Multi-scale CNN and CRF


Authors: REN Yonggong [1], YAN Ge, HE Xinyu

Affiliations: [1] School of Computer and Information Technology, Liaoning Normal University, Dalian 116081, Liaoning, China; [2] Information and Communication Engineering Postdoctoral Research Station, Dalian University of Technology, Dalian 116081, Liaoning, China; [3] Postdoctoral Workstation of Dalian Yongjia Electronic Technology Co., Ltd., Dalian 116081, Liaoning, China

Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2024, No. 4, pp. 859-864 (6 pages)

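Funding: Supported by the National Natural Science Foundation of China (62006108, 61976109); the China Postdoctoral Science Foundation General Program (2022M710593); the Liaoning Province "Xingliao Talents Program" (XLYC2006005); the Scientific Research Project of Higher Education Institutions of Liaoning Province (LJKZ0963); the Key Research and Development Project of the Liaoning Provincial Department of Science and Technology (2022JH2/101300271); the Natural Science Project of the Liaoning Provincial Department of Science and Technology (2021-BS-201); and the Liaoning Provincial Key Laboratory Project (LNZDSYS2000016).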

Abstract: Event detection is one of the main tasks of event extraction in natural language processing; it aims to automatically extract structured key information from large amounts of unstructured information. Existing methods suffer from incomplete feature extraction and unevenly distributed features. To improve the accuracy of event detection, this paper proposes a neural network model that combines the BERT pre-trained model with a multi-scale CNN (BMCC, BERT + Multi-scale CNN + CRF). First, the BERT (Bidirectional Encoder Representations from Transformers) pre-trained model produces the word embeddings, and its bidirectionally trained Transformer mechanism extracts the state features of the sequence. Second, convolution kernels of different scales are applied in multiple convolution channels to extract semantic information over different receptive fields and enrich the semantic representation. Finally, the BIO tagging scheme is integrated into a conditional random field (CRF) to label the sequence and thereby detect events. Experimental results show that the proposed model achieves an F1 score of 65.17% on the MAVEN dataset, indicating that the model performs well.
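The final step described in the abstract, BIO tagging decoded by a CRF, can be illustrated with a minimal sketch. This is not the authors' implementation: a trained CRF learns real-valued transition scores, whereas the sketch assumes hard constraints (score 0 for an allowed transition, negative infinity otherwise). The tag set, the event type `Attack`, the emission scores, and the function names `viterbi_bio` and `bio_to_spans` are all hypothetical and chosen only for illustration.

```python
import math

NEG_INF = -math.inf


def allowed(prev_tag, tag):
    """BIO constraint: I-X may only follow B-X or I-X of the same type X."""
    if tag.startswith("I-"):
        etype = tag[2:]
        return prev_tag in ("B-" + etype, "I-" + etype)
    return True


def viterbi_bio(emissions, tags):
    """Return the best-scoring tag sequence that respects the BIO constraint.

    emissions: one dict per token, mapping tag -> score (assumed finite).
    Transitions score 0 if allowed and -inf otherwise (an assumption;
    a trained CRF would use learned transition scores instead).
    """
    if not emissions:
        return []
    # A sequence may not start with an I- tag.
    score = {t: NEG_INF if t.startswith("I-") else emissions[0][t] for t in tags}
    backptrs = []
    for emit in emissions[1:]:
        new_score, ptr = {}, {}
        for t in tags:
            best_prev, best = None, NEG_INF
            for p in tags:
                if allowed(p, t) and score[p] > best:
                    best_prev, best = p, score[p]
            new_score[t] = best + emit[t] if best_prev is not None else NEG_INF
            ptr[t] = best_prev
        score = new_score
        backptrs.append(ptr)
    # Backtrack from the best final tag.
    path = [max(tags, key=lambda t: score[t])]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


def bio_to_spans(tag_seq):
    """Collapse a BIO sequence into (start, end_exclusive, event_type) spans."""
    spans, start, etype = [], None, None
    for i, tag in enumerate(list(tag_seq) + ["O"]):  # sentinel closes a trailing span
        if start is not None and not (tag.startswith("I-") and tag[2:] == etype):
            spans.append((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


# Hypothetical three-token example; the scores are made up.
tags = ["O", "B-Attack", "I-Attack"]
emissions = [
    {"O": 2.0, "B-Attack": 0.5, "I-Attack": 0.1},
    {"O": 1.0, "B-Attack": 0.9, "I-Attack": 1.5},  # per-token argmax would pick I-Attack
    {"O": 0.2, "B-Attack": 0.1, "I-Attack": 2.0},
]
path = viterbi_bio(emissions, tags)
spans = bio_to_spans(path)
print(path, spans)
```

Choosing the highest-scoring tag for each token independently would give `O, I-Attack, I-Attack` here, which is not a valid BIO sequence. Decoding over the whole sequence avoids this, which is the usual motivation for placing a CRF on top of per-token features.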

Keywords: event detection; BERT; multi-scale CNN; conditional random field (CRF); cross-validation

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
