Named Entity Recognition of Chinese Electronic Medical Records Based on Multi-Feature Fusion
(Original title: 多特征融合的中文电子病历命名实体识别)

Cited by: 5

Authors: SUN Zhen (孙振); LI Xinfu (李新福) (College of Cyberspace Security and Computer, Hebei University, Baoding, Hebei 071000, China)

Affiliation: [1] College of Cyberspace Security and Computer, Hebei University, Baoding, Hebei 071000, China

Source: Computer Engineering and Applications (《计算机工程与应用》), 2023, No. 23, pp. 136-144 (9 pages)

Abstract: Named entity recognition (NER) is a fundamental task in natural language processing. Existing research on NER for Chinese electronic medical records does not account for the complex structure of medical text or the imbalanced distribution of entity types in the datasets; it simply transfers general-domain NER models to the medical domain, and recognition performance suffers. To address these problems, this paper proposes a multi-feature fusion NER model for Chinese electronic medical records. First, character, radical, and four-corner code vectors are obtained, so that the glyph structure of Chinese characters enriches the semantic representation of medical text. Second, an entity label marking strategy marks the entity types that may be present in the vectors, strengthening the model's learning of different types of text data. Finally, the fused vector is fed into a Mogrifier GRU layer to further strengthen the semantic connections among the feature representations, and a CRF is used to establish label constraints. Experiments show that the proposed model achieves an F1 score of 88.72% on the CCKS2019 dataset and 95.44% on the MSRA dataset, verifying the effectiveness of the model.
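The fusion and Mogrifier steps named in the abstract can be illustrated with a small sketch. The abstract does not say how the character, radical, and four-corner code (四角) vectors are combined, what their dimensions are, or how many gating rounds are used, so everything below is an assumption: fusion is shown as plain concatenation, the gating follows the Mogrifier formulation as it is usually described (the input and the hidden state alternately rescale each other through `2 * sigmoid(...)` gates before the recurrent cell runs), and all names (`fuse`, `mogrify`, `q_mats`, `r_mats`) are hypothetical. The GRU cell itself, the entity label marking strategy, and the CRF layer are not shown.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def fuse(char_vec, radical_vec, corner_vec):
    """Fuse per-character features by concatenation (an assumption,
    not necessarily the operation used in the paper)."""
    return char_vec + radical_vec + corner_vec


def mogrify(x, h, q_mats, r_mats):
    """Alternately gate x by h and h by x, Mogrifier-style.

    q_mats[k] maps h to a gate of len(x); r_mats[k] maps x to a gate
    of len(h). Expects len(q_mats) == len(r_mats) or one more q than r.
    """
    for i in range(len(q_mats) + len(r_mats)):
        if i % 2 == 0:
            # Input update: x <- 2 * sigmoid(Q h) * x
            gate = matvec(q_mats[i // 2], h)
            x = [2.0 * sigmoid(g) * xi for g, xi in zip(gate, x)]
        else:
            # Hidden-state update: h <- 2 * sigmoid(R x) * h
            gate = matvec(r_mats[i // 2], x)
            h = [2.0 * sigmoid(g) * hi for g, hi in zip(gate, h)]
    return x, h


def random_matrix(rows, cols, rng):
    """Small random weights standing in for learned parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)

# Toy feature vectors for one character (dimensions are illustrative).
char_vec = [0.5, -0.2, 0.1, 0.3]
radical_vec = [0.4, 0.0]
corner_vec = [-0.1, 0.2]

x = fuse(char_vec, radical_vec, corner_vec)   # fused input, length 8
h = [0.0, 0.1, -0.1]                          # previous hidden state

q_mats = [random_matrix(len(x), len(h), rng) for _ in range(2)]
r_mats = [random_matrix(len(h), len(x), rng) for _ in range(2)]

# The gated pair would then be passed to an ordinary GRU cell.
x_new, h_new = mogrify(x, h, q_mats, r_mats)
```

The factor of 2 keeps a gate near 1 when its pre-activation is near 0, so with zero weights the input and hidden state pass through unchanged.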

Keywords: electronic medical records; named entity recognition; multi-feature; gated recurrent unit (GRU)

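Classification numbers: TP391.1 [Automation and Computer Technology: Computer Application Technology]; R197.323 [Automation and Computer Technology: Computer Science and Technology]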

 
