Multi-Graph Enabled Named Entity Recognition for Chinese Medical Records Processing


Authors: SHAN Tao; WU Jie[3]; JING Shenqi; YE Jiyuan[1]; LIU Yun[2]; GUO Yong'an[3]

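Affiliations: [1] School of Information Management, Nanjing University, Nanjing 210023, Jiangsu, China; [2] Jiangsu Province Hospital, Nanjing 210096, Jiangsu, China; [3] College of Telecommunications & Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, Jiangsu, China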

Source: Information Science (《情报科学》), 2024, No. 3, pp. 100-109, 117 (11 pages in total)

Funding: Jiangsu Province Frontier Leading Technology Basic Research Special Project (BK20202001); Jiangsu Province Key R&D Program (SJ221007).

Abstract: 【Purpose/significance】 Named entity recognition (NER), a core component of medical record processing, is crucial to improving the accuracy and efficiency of electronic medical record processing. NER is especially challenging for Chinese medical records because of the complexity of the Chinese language. Developing an effective NER model for Chinese medical records is therefore of great value for improving information extraction and data processing for medical records. 【Method/process】 A novel framework, NER-CMR (Chinese Medical Records Named Entity Recognition), is proposed to overcome the limitations of existing NER methods on Chinese medical records. NER-CMR addresses the entity nesting and boundary identification problems of traditional NER by incorporating contextual information such as common contiguous words and phrases. Specifically, the framework extracts adjacency, co-occurrence, and dependency relations between characters from related words and phrases, and this information is then fused into the neural NER model. NER-CMR comprises a character encoding module, a word embedding module, a graph construction module, a fusion module, and a CRF module. 【Result/conclusion】 In comprehensive experiments on CCKS, a widely used Chinese medical record dataset, and DIABETES, a real-world Chinese diabetes dataset, NER-CMR outperformed the baseline model in recognition performance. In addition, as a Chinese NER framework that introduces graph neural networks, the model allows its modules to be replaced flexibly, offering a new direction for research on named entity recognition in Chinese electronic medical records. 【Innovation/limitation】 A network graph based on the graph attention mechanism is proposed, a fusion layer is designed to perform multi-graph fusion, and two strategies are further used to handle the noise caused by incorrect relations; however, case studies at the application level of smart healthcare systems are lacking.
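To make the graph construction step more concrete, the following Python sketch shows one plausible way to derive character-level adjacency and co-occurrence edges from words matched in a sentence. It is an illustration under assumptions, not the authors' implementation: the abstract does not define the graphs precisely, so the edge definitions, the function names `match_words` and `build_char_graphs`, and the toy lexicon are all hypothetical. The dependency graph is left out because it would require a syntactic parser.

```python
from itertools import combinations


def match_words(sentence, lexicon):
    """Return (start, end) spans of every lexicon word found in the sentence.

    Hypothetical helper: exhaustive substring lookup against a word set.
    """
    spans = []
    max_len = max((len(w) for w in lexicon), default=0)
    for start in range(len(sentence)):
        # Only consider candidates of length >= 2 and <= longest lexicon word.
        for end in range(start + 2, min(len(sentence), start + max_len) + 1):
            if sentence[start:end] in lexicon:
                spans.append((start, end))
    return spans


def build_char_graphs(sentence, lexicon):
    """Build two edge sets over character positions (assumed definitions).

    adjacency:    neighbouring characters inside the same matched word
    cooccurrence: any two characters that share a matched word
    """
    adjacency = set()
    cooccurrence = set()
    for start, end in match_words(sentence, lexicon):
        for i in range(start, end - 1):
            adjacency.add((i, i + 1))
        for i, j in combinations(range(start, end), 2):
            cooccurrence.add((i, j))
    return adjacency, cooccurrence


# Toy example with nested candidate words ("糖尿病" inside "糖尿病史").
sentence = "患者有糖尿病史"
lexicon = {"患者", "糖尿病", "糖尿病史", "病史"}
adjacency, cooccurrence = build_char_graphs(sentence, lexicon)
print(sorted(adjacency))
print(sorted(cooccurrence))
```

In a model of the kind the abstract describes, edge sets like these would be turned into adjacency matrices and passed, together with character representations, to graph attention layers before fusion and CRF decoding. The nested words in the toy lexicon show why several overlapping spans can contribute edges to the same characters, which is where noisy or incorrect relations may arise.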

Keywords: named entity recognition; Chinese medical records; adjacency graph; attention mechanism; knowledge graph

Classification code: G254 [Cultural Science: Library Science]

 
