Entity Recognition in Famous Medical Records Based on BRL Neural Network Model

Cited by: 1

Authors: YANG Hang, PENG Yehui [1]; YANG Wei [2]; WANG Jiaheng, ZHAO Zhiwei, XU Wenyuan, LI Yuxin, ZHU Yan [3]; LIU Lihong [3]

Affiliations: [1] School of Mathematics and Computational Science, Hunan University of Science and Technology, Xiangtan 411201, China; [2] Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China; [3] Institute of Information on Traditional Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China

Source: Chinese Journal of Experimental Traditional Medical Formulae, 2024, No. 24, pp. 167-173 (7 pages)

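Funding: National Key Research and Development Program of China (2023YFC3503404); Self-Selected Topic Project of the China Academy of Chinese Medical Sciences (Z0643).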

Abstract: Objective: To improve the recognition accuracy of named entities in medical record texts and to enable effective mining and use of medical record knowledge, we constructed a Bert-Radical-Lexicon (BRL) neural network model, designed around the characteristics of medical record texts, to recognize medical record entities. Method: We selected 408 medical records related to hypertension from the Complete Library of Famous Medical Records of Chinese Dynasties (《中华历代名医医案全库》) and, through manual annotation, constructed a dataset of 1672 medical record corpus entries. We then randomly divided these entries into three subsets: a training set (1004 entries), a testing set (334 entries), and a validation set (334 entries). On this basis, we built the BRL model, which fuses multiple text features of medical records, together with its variants BRL-B, BRL-L, and BRL-R and a baseline model, Base. During the training phase, we trained all of these models on the training set. To reduce the risk of overfitting, we continuously monitored each model's performance on the validation set during training and saved the best-performing model. Finally, we evaluated the models on the testing set. Result: Compared with the other models, the BRL model performed best on the medical record named entity recognition task. Across eight entity types (disease, symptom, tongue manifestation, pulse condition, syndrome, method of treatment, prescription, and traditional Chinese medicine (TCM) herb), its overall precision was 90.09%, its recall was 90.61%, and the harmonic mean of precision and recall (F1) was 90.35%. Compared with the Base model, the BRL model improved the overall F1 of entity recognition by 5.22%; the F1 for pulse condition entities rose by 6.92%, the largest improvement among the entity types. Conclusion: By incorporating multiple medical record text features in the embedding layer, the BRL neural network model achieves stronger named entity recognition and can therefore extract more accurate and reliable TCM clinical information.
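The split sizes and the reported F1 can be checked with a short sketch. This is an illustration only, not the authors' code: the function names `split_corpus` and `f1_score`, the fixed seed, and the use of integer placeholders in place of annotated entries are all assumptions. The abstract states only that the split was random and that F1 is the harmonic mean of precision and recall.

```python
import random


def split_corpus(items, sizes=(1004, 334, 334), seed=0):
    """Shuffle items and cut them into (train, test, validation) subsets.

    The sizes follow the abstract; the seed is an arbitrary assumption.
    """
    items = list(items)
    if sum(sizes) != len(items):
        raise ValueError("subset sizes must sum to the corpus size")
    random.Random(seed).shuffle(items)  # seeded, so the split is reproducible
    n_train, n_test, _ = sizes
    train = items[:n_train]
    test = items[n_train:n_train + n_test]
    valid = items[n_train + n_test:]
    return train, test, valid


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Integer placeholders stand in for the 1672 annotated corpus entries.
train, test, valid = split_corpus(range(1672))
print(len(train), len(test), len(valid))   # 1004 334 334
print(round(f1_score(90.09, 90.61), 2))    # 90.35
```

The second printed value matches the F1 of 90.35% reported in the abstract for a precision of 90.09% and a recall of 90.61%.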

Keywords: named entity recognition; pre-trained model; radical embedding; associated word embedding; famous medical records

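Classification codes: R22 [Medicine and Health: Basic Theory of Traditional Chinese Medicine]; R28 [Medicine and Health: Traditional Chinese Medicine]; R249 [Automation and Computer Technology: Control Theory and Control Engineering]; TP183 [Automation and Computer Technology: Control Science and Engineering]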
