Chinese Named Entity Recognition Method Based on Ensemble Learning  Cited by: 6


Authors: Liang Bingtao (梁兵涛) [1]; Ni Yunfeng (倪云峰) [2]

Affiliations: [1] Hangzhou Youxing Technology Co., Ltd., Hangzhou, Zhejiang 310000, China; [2] College of Communication and Information Engineering, Xi'an University of Science and Technology, Xi'an, Shaanxi 710600, China

Source: Journal of Nanjing Normal University (Natural Science Edition) (《南京师大学报(自然科学版)》), 2022, No. 3, pp. 123-131 (9 pages)

Abstract: The classical BiLSTM-CRF (bi-directional long short-term memory-conditional random field) model for Chinese named entity recognition has three shortcomings: its embedding vectors cannot represent polysemous words, attention in the encoding layer is dispersed, and local spatial features are not captured. To address these problems, this paper proposes an ensemble model that combines the strengths of the BERT-BiGRU-MHA-CRF and BERT-IDCNN-CRF models. The method first uses a trimmed BERT model to obtain semantic vectors that carry contextual information, then feeds these vectors into a BiGRU-MHA (bi-directional gated recurrent unit-multi-head attention) network and an IDCNN (iterated dilated convolutional neural network). The former captures the temporal features of the input sequence and assigns weights according to character importance; the latter mainly captures the spatial features of the input. The captured features are fused by mean (average) ensembling, and a CRF layer then produces the globally optimal tag sequence. On the People's Daily and Microsoft Research Asia (MSRA) datasets, the ensemble model reaches F1 values of 96.09% and 95.01%, respectively, an improvement of more than 0.74% and 0.55% over the individual models, which verifies the effectiveness of the proposed method.
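The fusion-and-decoding step described in the abstract can be illustrated with a minimal sketch. The abstract does not say exactly which tensors are averaged, so this example assumes each branch emits a per-character tag score matrix of shape [sequence length][number of tags], that the two matrices are averaged element-wise, and that a single linear-chain CRF transition matrix is used for Viterbi decoding. The function names, tag set, and all numbers are hypothetical toy values, not taken from the paper.

```python
from typing import List

Matrix = List[List[float]]


def mean_ensemble(a: Matrix, b: Matrix) -> Matrix:
    """Element-wise average of two [seq_len][num_tags] score matrices."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("score matrices must have the same shape")
    return [[(x + y) / 2.0 for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def viterbi_decode(emissions: Matrix, transitions: Matrix) -> List[int]:
    """Return the best tag index path; transitions[i][j] scores tag i -> tag j."""
    if not emissions:
        return []
    num_tags = len(transitions)
    score = list(emissions[0])          # best score ending in each tag so far
    backpointers: List[List[int]] = []  # best previous tag for each step and tag
    for emit in emissions[1:]:
        next_score, pointers = [], []
        for j in range(num_tags):
            best_i = max(range(num_tags), key=lambda i: score[i] + transitions[i][j])
            next_score.append(score[best_i] + transitions[best_i][j] + emit[j])
            pointers.append(best_i)
        score = next_score
        backpointers.append(pointers)
    # Follow the back-pointers from the best final tag to recover the path.
    path = [max(range(num_tags), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical toy inputs: three characters, three tags.
tags = ["O", "B-PER", "I-PER"]
bigru_mha_scores = [[0.2, 2.0, 0.1], [0.5, 0.1, 1.5], [2.0, 0.1, 0.3]]
idcnn_scores = [[0.4, 1.0, 0.3], [0.3, 0.3, 0.9], [1.0, 0.3, 0.9]]
# Assumed transition scores; O -> I-PER is heavily penalised.
transitions = [[0.5, 0.0, -5.0], [0.0, -1.0, 1.0], [0.0, -1.0, 0.5]]

fused = mean_ensemble(bigru_mha_scores, idcnn_scores)
best_path = viterbi_decode(fused, transitions)
print([tags[i] for i in best_path])  # prints ['B-PER', 'I-PER', 'O']
```

Averaging before a shared decoder keeps the two branches on an equal footing; a weighted average or separate CRF layers per branch would be alternative designs that the abstract neither confirms nor rules out.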

Keywords: named entity recognition; BERT model; ensemble learning; attention mechanism; iterated dilated convolutional network

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
