ABS-HDL: Chinese Medical Named Entity Recognition Model Based on BIASRU


Authors: Xuanyan Sheng (盛萱妍); Qing Shao (邵清) [1]

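Affiliation: [1] School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai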

Source: Modeling and Simulation (《建模与仿真》), 2024, No. 4, pp. 4075-4089 (15 pages)

Abstract: Chinese medical named entity recognition aims to extract key entities from unstructured Chinese medical texts. To address long model training times and the tendency of traditional character-vector methods to overlook word boundaries, this paper proposes ABS-HDL (ALBERT-BIASRU-SoftAttention-CRF Hybrid Deep Learning), a Chinese medical named entity recognition model based on multi-head interactive attention. The method first uses the ALBERT pre-trained model to obtain word vector and character vector representations separately, then combines them into a single character-word vector matrix. Next, the proposed BIASRU semantic extraction layer integrates multi-head interactive attention into the SRU to learn the features of the character-word vector matrix effectively, and uses bidirectional modeling to precisely capture relationships within the sequence context. In the soft attention weight allocation layer, the model dynamically adjusts the weight distribution, which strengthens its ability to recognize entity boundaries. Finally, a CRF decoding layer optimizes the prediction of the label sequence. Experimental results show that the model outperforms existing models on a Chinese diabetes dataset.
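As a rough illustration of two steps named in the abstract (building a character-word vector matrix and assigning soft attention weights), the following is a minimal sketch, not the authors' implementation. The abstract does not say how the character and word vectors are combined or how attention scores are computed, so the concatenation rule, the dot-product scoring, the `query` vector, and the function names `combine_char_word` and `soft_attention` are all assumptions made here for illustration. Toy numbers stand in for ALBERT outputs.

```python
import math


def combine_char_word(char_vecs, word_vecs, word_spans):
    """Build a character-word matrix with one row per character.

    Assumption (not from the paper): each character vector is concatenated
    with the vector of the word that contains it.
    word_spans[i] = (start, end) character indices (end exclusive) of word i.
    """
    rows = []
    for word_idx, (start, end) in enumerate(word_spans):
        for char_idx in range(start, end):
            rows.append(char_vecs[char_idx] + word_vecs[word_idx])
    return rows


def soft_attention(rows, query):
    """Generic soft attention: softmax over dot-product scores.

    `query` is a hypothetical scoring vector; in a trained model it would
    be learned. Returns the weights and the weighted sum of the rows.
    """
    scores = [sum(r * q for r, q in zip(row, query)) for row in rows]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(rows[0])
    context = [sum(w * row[d] for w, row in zip(weights, rows)) for d in range(dim)]
    return weights, context


# Toy input: 3 characters segmented into 2 words
# (characters 0-1 form word 0, character 2 forms word 1).
char_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
word_vecs = [[0.5], [2.0]]
word_spans = [(0, 2), (2, 3)]

matrix = combine_char_word(char_vecs, word_vecs, word_spans)
weights, context = soft_attention(matrix, query=[1.0, 1.0, 1.0])
```

In this sketch every character row carries the vector of its enclosing word, which is one simple way to expose word-boundary information to a character-level tagger. The attention weights sum to 1, so positions with higher scores contribute more to the weighted representation.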

Keywords: named entity recognition; ALBERT; simple recurrent unit; multi-head interactive attention; soft attention

Classification code: G63 [Cultural Sciences - Education]

 
