An Overview of Research on Electronic Medical Record Oriented Named Entity Recognition and Entity Relation Extraction (电子病历命名实体识别和实体关系抽取研究综述)  Cited by: 127

Authors: Yang Jinfeng (杨锦锋)[1], Yu Qiubin (于秋滨)[2], Guan Yi (关毅)[1], Jiang Zhipeng (蒋志鹏)[1]

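Affiliations: [1] Web Intelligence Laboratory, Language Technology Center, Harbin Institute of Technology, Harbin 150001; [2] Medical Records Room, The Second Affiliated Hospital of Harbin Medical University, Harbin 150086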

Source: Acta Automatica Sinica (《自动化学报》), 2014, No. 8, pp. 1537-1562 (26 pages)

Funding: Supported by the National Natural Science Foundation of China (60975077)

Abstract: Electronic medical records (EMRs) are generated in the process of clinical treatment. The named entities and entity relations in EMRs reflect patients' health conditions and carry a large amount of medical knowledge closely tied to those conditions. Consequently, named entity recognition and entity relation extraction on EMRs are an important extension of information extraction research into the medical domain. This paper first discusses the linguistic and structural characteristics of EMR narratives and then outlines the general approaches to named entity recognition and entity relation extraction. On that basis, it analyzes the specific tasks of named entity recognition, entity assertion recognition, and entity relation extraction on EMRs, together with the main methods used for each task. Related shared evaluation tasks and annotated corpora, as well as several important dictionaries and knowledge bases in the medical domain, are also introduced. Finally, the paper discusses the problems that remain to be solved in this field and its future research directions.

Keywords: electronic medical records; named entity recognition; entity relation extraction; shared evaluation tasks

Classification number: TP391.1 [Automation and Computer Technology: Computer Application Technology]

 
