Entity Name Recognition Method Based on Dilated Convolutional Iterative and Attention Mechanism


Authors: LÜ Jianghai, DU Junping [1], ZHOU Nan, XUE Zhe

Affiliation: [1] Beijing Key Laboratory of Intelligent Communication Software and Multimedia, School of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876, China

Source: Computer Engineering, 2021, No. 1, pp. 58-65, 71 (9 pages)

Funding: National Natural Science Foundation of China (61772083, 61532006); Guangxi Science and Technology Major Project (AA18118054).

Abstract: Traditional entity name recognition methods fail to balance effective feature extraction from text sequences against the training speed of neural network models. To address this problem, this paper proposes an entity name recognition method based on an Iterated Dilated Convolutional Neural Network (IDCNN) and an Attention Mechanism (ATT). The IDCNN exploits the optimization capability of GPU parallel computing while retaining the key property of the Long Short-Term Memory (LSTM) neural network, namely recording as much input information as possible with a simple structure, so it extracts text sequence features accurately while accelerating model training. The ATT uses the grammatical information of the text and the part-of-speech information of the words to select, from the many text features, those most critical to entity name recognition, which improves the accuracy of text feature extraction. Experimental results on a news dataset and a Weibo dataset show that the model trains significantly faster than the traditional bidirectional LSTM network, and that the attention-based method improves the evaluation metrics of entity name recognition by about 2% over the traditional method without an attention mechanism.
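The two ingredients named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the kernel width of 3, the dilation schedule (1, 1, 2), the four shared-weight iterations, and the plain dot-product self-attention are all assumptions chosen for illustration, and the abstract does not say how grammatical or part-of-speech information enters the attention scores. The helper names `dilated_conv1d`, `idcnn`, `receptive_field`, and `self_attention` are hypothetical.

```python
import numpy as np

def dilated_conv1d(x, w, dilation):
    """1-D dilated convolution with 'same' padding followed by ReLU.
    x: (T, C_in) token features; w: (K, C_in, C_out) with odd K."""
    T = x.shape[0]
    K = w.shape[0]
    pad = dilation * (K - 1) // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    out = np.zeros((T, w.shape[2]))
    for k in range(K):
        # Kernel tap k reads the input shifted by k * dilation positions.
        out += xp[k * dilation : k * dilation + T] @ w[k]
    return np.maximum(out, 0.0)

def idcnn(x, weights, dilations=(1, 1, 2), n_iter=4):
    """Apply the same block of dilated convolutions n_iter times.
    Reusing `weights` on every iteration is the 'iterated' part (assumed here)."""
    h = x
    for _ in range(n_iter):
        for w, d in zip(weights, dilations):
            h = dilated_conv1d(h, w, d)
    return h

def receptive_field(kernel_size=3, dilations=(1, 1, 2), n_iter=4):
    """Number of input tokens that can influence one output position."""
    return 1 + n_iter * sum((kernel_size - 1) * d for d in dilations)

def self_attention(h):
    """Generic scaled dot-product self-attention over token features.
    Returns the re-weighted features and the (T, T) attention matrix."""
    scores = h @ h.T / np.sqrt(h.shape[1])
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)
    return attn @ h, attn

rng = np.random.default_rng(0)
T, C = 20, 8  # sequence length and feature width (illustrative values)
x = rng.normal(size=(T, C))
weights = [rng.normal(scale=0.3, size=(3, C, C)) for _ in range(3)]

features = idcnn(x, weights)
context, attn = self_attention(features)
print(features.shape, context.shape, receptive_field())
```

Every position in a convolutional layer is computed independently of the others, which is why this kind of model parallelizes well on a GPU, whereas an LSTM must process tokens one after another. With the assumed settings, the receptive field grows to 33 tokens while only three weight tensors are stored.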

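Keywords: entity name recognition; attention mechanism; dilated convolution; Long Short-Term Memory (LSTM) network; Conditional Random Field (CRF)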

Classification No.: TP391 [Automation and Computer Technology / Computer Application Technology]

 
