Joint Extraction Method for Chinese Medical Events (面向中文医疗事件的联合抽取方法)  Cited by: 5


Authors: YU Jie [1], JI Bin, LIU Lei, LI Sha-sha [1], MA Jun [1], LIU Hui-jun (College of Computer, National University of Defense Technology, Changsha 410073, China; Institute of Logistics Science and Technology, Academy of Military Sciences, Beijing 100091, China)

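Affiliations: [1] College of Computer, National University of Defense Technology, Changsha 410073, China; [2] Institute of Logistics Science and Technology, Academy of Military Sciences, Beijing 100091, China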

Source: Computer Science (《计算机科学》), 2021, No. 11, pp. 287-293 (7 pages)

Funding: National Natural Science Foundation of China (61532001).

Abstract: The spread of electronic medical records (EMRs) makes it possible to extract high-value information from clinical records quickly and automatically. A tumor medical event, an important kind of medical information, consists of a series of attributes describing a malignant tumor. Tumor medical event extraction has recently become a research hotspot: many academic conferences have released it as an evaluation task and provided a series of high-quality annotated datasets. Targeting the discrete nature of tumor medical event attributes, this paper proposes a joint extraction method for Chinese medical events that jointly extracts two attributes, the primary tumor site and the primary tumor size, and also extracts tumor metastasis sites. In addition, to address the small quantity and limited variety of annotated tumor medical event texts, it proposes a pseudo-data generation algorithm based on global random replacement of key information, which improves the joint extraction method's transfer learning ability across different types of tumor medical events. The proposed method placed third in the CCKS2020 evaluation task on clinical medical event extraction from Chinese electronic medical records, and extensive experiments on the CCKS2019 and CCKS2020 datasets verify its effectiveness.
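The abstract names the pseudo-data generation step but gives no details, so the following is only a minimal sketch of one plausible reading of "global random replacement of key information", not the authors' algorithm. It assumes each annotated sample is a dictionary with a `text` string and a `labels` mapping from attribute type to mention strings. Each mention is swapped for a value drawn at random from a pool built over the whole dataset, and the labels are updated to match. The function names `build_pools` and `make_pseudo_sample`, the attribute names, and the sample sentences are all hypothetical.

```python
import random
import re
from collections import defaultdict


def build_pools(samples):
    """Collect every annotated value of each attribute type across the whole dataset."""
    pools = defaultdict(set)
    for sample in samples:
        for attr, values in sample["labels"].items():
            pools[attr].update(v for v in values if v)
    # Sort so that sampling is reproducible for a fixed seed.
    return {attr: sorted(vals) for attr, vals in pools.items()}


def make_pseudo_sample(sample, pools, rng):
    """Return a new sample whose annotated mentions are swapped for random pool values."""
    mapping = {}      # original mention -> replacement mention
    new_labels = {}
    for attr, values in sample["labels"].items():
        new_labels[attr] = []
        for value in values:
            if not value:
                continue  # skip empty annotations
            if value not in mapping:
                mapping[value] = rng.choice(pools[attr])
            new_labels[attr].append(mapping[value])
    if not mapping:
        return {"text": sample["text"], "labels": new_labels}
    # One regex pass so replaced text is never rescanned; longer mentions are tried first.
    pattern = re.compile(
        "|".join(re.escape(v) for v in sorted(mapping, key=len, reverse=True))
    )
    text = pattern.sub(lambda m: mapping[m.group(0)], sample["text"])
    return {"text": text, "labels": new_labels}


# Hypothetical annotated samples (invented for illustration, not from CCKS data).
samples = [
    {
        "text": "A mass in the right lung, about 3.2 cm x 2.1 cm, with metastasis to the liver.",
        "labels": {
            "primary_site": ["right lung"],
            "tumor_size": ["3.2 cm x 2.1 cm"],
            "metastasis_site": ["liver"],
        },
    },
    {
        "text": "A tumor of the gastric antrum measuring 5 cm has spread to the peritoneum.",
        "labels": {
            "primary_site": ["gastric antrum"],
            "tumor_size": ["5 cm"],
            "metastasis_site": ["peritoneum"],
        },
    },
]

pools = build_pools(samples)
pseudo = make_pseudo_sample(samples[0], pools, random.Random(0))
print(pseudo["text"])
print(pseudo["labels"])
```

Drawing replacements from a dataset-wide pool, rather than from the same record, is what lets a sentence written about one tumor type carry the sites and sizes of another. That is the kind of variety the abstract credits for the improved transfer across tumor event types. A real pipeline would also need to track character offsets if the annotations are span-based.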

Keywords: Chinese electronic medical records; medical event extraction; transfer learning; joint extraction; tumor events

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
