Chinese Medical Named Entity Recognition Based on RBAC Model

Cited by: 1

Authors: ZHANG Bin[1]; ZHAO Tingting[1]; ZHANG Bixia; CHEN Yarui[1]; WANG Yuan (College of Artificial Intelligence, Tianjin University of Science & Technology, Tianjin 300457, China)

Affiliation: [1] College of Artificial Intelligence, Tianjin University of Science & Technology, Tianjin 300457

Source: Journal of Tianjin University of Science & Technology, 2024, No. 5, pp. 56-62 (7 pages)

Funding: National Natural Science Foundation of China (61976156); Tianjin Enterprise Science and Technology Commissioner Project (20YDTPJC00560).

Abstract: Chinese medical named entity recognition (NER) aims to extract structured entities from unstructured data, and current mainstream research relies on large amounts of training data. To address the scarcity of training data for Chinese medical NER, this paper proposes a RoBERTa-BiGRU-Attention-CRF (RBAC) model based on joint word segmentation, together with a data augmentation method for NER based on semantic search. A pretrained model and a bidirectional gated recurrent unit (BiGRU) first extract a deep bidirectional semantic representation of the text, which is then fed to both a word segmentation module and an NER module. The word segmentation module uses a conditional random field (CRF) to obtain word segmentation information. The NER module uses a BiGRU and multi-head attention to obtain a mixed semantic representation, which is passed to a CRF to produce the NER tag sequence. Experiments on the CCKS2019 Chinese electronic medical record dataset show that the method reaches an F1 score of 90.5% with a small amount of data, demonstrating its effectiveness.
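The last step in the abstract, where a CRF turns per-character scores into a tag sequence, can be illustrated with a minimal Viterbi decoder. This is only a sketch and not the authors' implementation. The function name `viterbi_decode`, the tag set, and all emission and transition scores are invented for illustration. In the model described above, the emission scores would come from the RoBERTa, BiGRU and attention layers, and the transition scores would be learned CRF parameters.

```python
from typing import Dict, List, Tuple

NEG_INF = float("-inf")


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    tags: List[str],
) -> List[str]:
    """Return the highest-scoring tag sequence under a linear-chain CRF.

    emissions[t][tag] is the score of `tag` at position t; transitions[(a, b)]
    is the score of moving from tag a to tag b (missing pairs are forbidden).
    """
    if not emissions:
        return []
    # best[t][tag]: best score of any path that ends in `tag` at position t.
    best = [{tag: emissions[0].get(tag, NEG_INF) for tag in tags}]
    back: List[Dict[str, str]] = []
    for t in range(1, len(emissions)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in tags:
            # Choose the previous tag that maximizes the path score into `cur`.
            prev_best = max(
                tags,
                key=lambda p: best[-1][p] + transitions.get((p, cur), NEG_INF),
            )
            scores[cur] = (
                best[-1][prev_best]
                + transitions.get((prev_best, cur), NEG_INF)
                + emissions[t].get(cur, NEG_INF)
            )
            pointers[cur] = prev_best
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical BIO tag set and scores for a four-character input.
TAGS = ["O", "B-DISEASE", "I-DISEASE"]
# ("O", "I-DISEASE") is absent, so an entity cannot start with an I- tag.
TRANSITIONS = {
    ("O", "O"): 0.5, ("O", "B-DISEASE"): 0.3,
    ("B-DISEASE", "O"): -0.5, ("B-DISEASE", "B-DISEASE"): -1.0,
    ("B-DISEASE", "I-DISEASE"): 1.0,
    ("I-DISEASE", "O"): 0.2, ("I-DISEASE", "B-DISEASE"): -0.5,
    ("I-DISEASE", "I-DISEASE"): 0.8,
}
EMISSIONS = [
    {"O": 2.0, "B-DISEASE": 0.5, "I-DISEASE": 0.1},
    {"O": 0.2, "B-DISEASE": 1.0, "I-DISEASE": 1.2},  # I- scores highest alone
    {"O": 0.1, "B-DISEASE": 0.3, "I-DISEASE": 2.0},
    {"O": 0.4, "B-DISEASE": 0.2, "I-DISEASE": 1.8},
]

print(viterbi_decode(EMISSIONS, TRANSITIONS, TAGS))
# → ['O', 'B-DISEASE', 'I-DISEASE', 'I-DISEASE']
```

At the second position the emission scores alone favor `I-DISEASE`, but the forbidden `O` to `I-DISEASE` transition makes the decoder choose `B-DISEASE` instead. Enforcing this kind of sequence-level consistency is what a CRF output layer adds over tagging each character independently.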

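Keywords: multi-task learning; pretrained model; bidirectional gated recurrent unit; multi-head attention; conditional random field; data augmentation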

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
