Chinese Named Entity Recognition Based on Attention Mechanism Feature Fusion (Cited by: 8)

Authors: LIAO Liefa[1]; XIE Shusong (School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, Jiangxi, China)

Affiliation: [1] School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, Jiangxi, China

Source: Computer Engineering (《计算机工程》), 2023, Issue 4, pp. 256-262 (7 pages)

Funding: National Natural Science Foundation of China (71761018).

Abstract: Named Entity Recognition (NER) underpins information extraction, information retrieval, knowledge graph construction, and other tasks in Natural Language Processing (NLP). In NER, the Transformer encoder attends mainly to global semantics and is insensitive to position and direction information, whereas a Bidirectional Long Short-Term Memory (BiLSTM) network can extract directional information from text but lacks global semantic information. To obtain both global semantic and directional information, a model is proposed that dynamically fuses the Transformer encoder and BiLSTM through an attention mechanism. The Transformer encoder is improved with relative position encoding and a modified attention calculation formula; the improved encoder extracts global semantic information, while the BiLSTM captures directional information. The attention mechanism dynamically adjusts the fusion weights, deeply combining the global semantic and directional information to obtain richer context features. A Conditional Random Field (CRF) then decodes the fused features to predict the entity label sequence. Furthermore, because Word2Vec and other traditional word vector methods cannot represent the polysemy of words, the RoBERTa-wwm pretrained model is used as the embedding layer to provide character-level embeddings, supplying more contextual semantic and lexical information and enhancing entity recognition. Experimental results show that the proposed method achieves F1 values of 96.68% and 71.29% on the Chinese NER datasets Resume and Weibo, respectively, outperforming ID-CNN, BiLSTM, CAN-NER, and other methods.
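The architecture described in the abstract can be illustrated with a short sketch. The PyTorch code below is a minimal illustration, not the authors' implementation: the module names (AttentionFeatureFusion, TransformerBiLSTMFusionTagger) and all layer sizes are hypothetical, a plain embedding table stands in for the RoBERTa-wwm embedding layer, and the relative position encoding, modified attention formula, and CRF decoder are omitted; the model only produces per-token emission scores of the kind a CRF layer would decode.

# Minimal sketch (assumptions noted above, not the paper's released code):
# a Transformer encoder branch for global semantics, a BiLSTM branch for
# directional information, and an attention-style gate that weights the two
# feature streams per token.
import torch
import torch.nn as nn


class AttentionFeatureFusion(nn.Module):
    """Dynamically fuse two token-level feature streams with learned weights."""

    def __init__(self, hidden_size: int):
        super().__init__()
        # Scores each feature stream per token; softmax turns scores into weights.
        self.score = nn.Linear(hidden_size, 1)

    def forward(self, trans_feats: torch.Tensor, lstm_feats: torch.Tensor) -> torch.Tensor:
        # trans_feats, lstm_feats: (batch, seq_len, hidden_size)
        stacked = torch.stack([trans_feats, lstm_feats], dim=2)   # (B, L, 2, H)
        weights = torch.softmax(self.score(stacked), dim=2)       # (B, L, 2, 1)
        return (weights * stacked).sum(dim=2)                     # (B, L, H)


class TransformerBiLSTMFusionTagger(nn.Module):
    """Character embeddings -> Transformer + BiLSTM branches -> fused emissions."""

    def __init__(self, vocab_size: int, num_tags: int,
                 embed_dim: int = 256, hidden_size: int = 256):
        super().__init__()
        # In the paper this embedding comes from RoBERTa-wwm; nn.Embedding stands in here.
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=8, batch_first=True)
        self.transformer = nn.TransformerEncoder(encoder_layer, num_layers=2)
        self.bilstm = nn.LSTM(embed_dim, hidden_size // 2, batch_first=True,
                              bidirectional=True)
        self.proj = nn.Linear(embed_dim, hidden_size)  # align branch dimensions
        self.fusion = AttentionFeatureFusion(hidden_size)
        # Per-token tag emission scores; the paper decodes these with a CRF layer.
        self.classifier = nn.Linear(hidden_size, num_tags)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        x = self.embedding(token_ids)                  # (B, L, E)
        global_feats = self.proj(self.transformer(x))  # global semantic branch
        direction_feats, _ = self.bilstm(x)            # directional branch
        fused = self.fusion(global_feats, direction_feats)
        return self.classifier(fused)                  # (B, L, num_tags)


if __name__ == "__main__":
    model = TransformerBiLSTMFusionTagger(vocab_size=6000, num_tags=9)
    emissions = model(torch.randint(0, 6000, (2, 32)))
    print(emissions.shape)  # torch.Size([2, 32, 9])

The per-token softmax over the two branch scores is what allows the model to shift weight between global semantic features and directional features at each position, which is the dynamic fusion behaviour the abstract describes.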

Keywords: attention mechanism; Transformer encoder; feature fusion; Chinese named entity recognition; pretrained model

CLC Number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
