Chinese Named Entity Recognition Based on Dilated Gated Convolution Feature Fusion

Cited by: 9

Authors: YANG Changpei; LIAO Liefa [1,2]

Affiliations: [1] School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, Jiangxi, China; [2] School of Software Engineering, Jiangxi University of Science and Technology, Nanchang 330000, China

Source: Computer Engineering (《计算机工程》), 2023, No. 8, pp. 85-95 (11 pages)

Funding: National Natural Science Foundation of China (71462018, 71761018).

Abstract: In Chinese Named Entity Recognition (NER), Long Short-Term Memory (LSTM) network models, which have a recurrent structure, address the long-distance dependency problem by capturing temporal features, but they capture features in only one way and their capacity to acquire information is limited. A Convolutional Neural Network (CNN) processes text in parallel through multiple convolutional layers, which speeds up computation and captures the spatial features of the text. However, simply stacking many convolutional layers easily leads to vanishing gradients. To obtain multi-dimensional text features while mitigating the vanishing gradient problem, this paper proposes a Chinese NER model based on RoBERTa-wwm-DGCNN-BiLSTM-BMHA-CRF. First, the pre-trained language model RoBERTa-wwm, which is based on whole-word masking, represents the text as character-level embedding vectors and captures deep contextual semantic information. Second, a gating mechanism and a residual structure are used to improve the Dilated CNN (DCNN) and reduce the risk of vanishing gradients; the Bi-directional Long Short-Term Memory (BiLSTM) network and the resulting Dilated Gated CNN (DGCNN) then capture the temporal and spatial features of the text, respectively. Third, a Bi-linear Multi-Head Attention (BMHA) mechanism dynamically fuses the multi-dimensional text features. Finally, a Conditional Random Field (CRF) constrains the results to obtain the best label sequence. Experimental results show that the proposed model achieves F1 values of 97.20%, 74.28%, and 95.74% on the Resume, Weibo, and MSRA datasets, respectively, which demonstrates its effectiveness for Chinese NER.

Keywords: named entity recognition; RoBERTa-wwm model; dilated convolution; attention mechanism; feature fusion

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
