Study on Entity Recognition of Liver Cancer Electronic Medical Records Based on RoBERTa-CRF

Cited by: 3


Authors: DENG Jiale, HU Zhensheng, LIAN Wanmin[2], HUA Yunpeng[3], ZHOU Yi[1]

Affiliations: [1] Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China; [2] Guangdong Second Provincial General Hospital, Guangzhou 510317, China; [3] The First Affiliated Hospital of Sun Yat-sen University, Guangzhou 510080, China

Source: Journal of Medical Informatics (《医学信息学杂志》), 2023, No. 6, pp. 42-47 (6 pages)

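Funding: National Key Research and Development Program of China (Grant No. 2021YFC2009402); National Key Research and Development Program of China (Grant No. 2022YFC3601600); Natural Science Foundation of Guangdong Province (Grant No. 2021A1515011897).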

Abstract: Purpose/Significance: Electronic medical records (EMRs) of liver cancer contain a large amount of specialized medical knowledge, most of which exists as unstructured data and is difficult to extract automatically. Entity recognition on liver cancer EMRs supports the construction of clinical decision support systems and medical knowledge graphs for liver cancer. Method/Process: A named entity recognition (NER) model combining RoBERTa with a conditional random field (CRF) is built, then trained and tested on real, self-annotated liver cancer EMR data. Result/Conclusion: The RoBERTa-CRF model outperforms the other baseline models and achieves good entity recognition performance.
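
The abstract does not describe the model internals, so the following is only a minimal sketch of the general idea behind putting a CRF layer on top of an encoder such as RoBERTa: the encoder produces a score for each tag at each token, and the CRF adds tag-to-tag transition scores so that the best whole sequence is chosen with Viterbi decoding. The tag set, the transition matrix, and the emission scores are all hypothetical illustration values, not the authors' labels, parameters, or data.

```python
from typing import List


def viterbi_decode(emissions: List[List[float]],
                   transitions: List[List[float]]) -> List[int]:
    """Return the highest-scoring sequence of tag indices.

    emissions[t][j]   -- score of tag j at token t (stand-in for encoder output)
    transitions[i][j] -- score of moving from tag i to tag j (the CRF part)
    """
    if not emissions:
        return []
    num_tags = len(transitions)
    # Best score of any path that ends in each tag at the current token.
    score = list(emissions[0])
    backpointers: List[List[int]] = []
    for emit in emissions[1:]:
        new_score, pointers = [], []
        for j in range(num_tags):
            # Pick the previous tag that maximizes path score into tag j.
            best_prev = max(range(num_tags),
                            key=lambda i: score[i] + transitions[i][j])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + transitions[best_prev][j] + emit[j])
        score = new_score
        backpointers.append(pointers)
    # Trace the best path back from the best final tag.
    path = [max(range(num_tags), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical BIO label set; the paper's actual entity types are not given here.
TAGS = ["O", "B-DISEASE", "I-DISEASE"]
FORBIDDEN = -1e4  # large negative score that rules out a transition
TRANSITIONS = [
    [0.0, 0.0, FORBIDDEN],  # from O: an entity cannot start with I-
    [0.0, 0.0, 1.0],        # from B-DISEASE
    [0.0, 0.0, 1.0],        # from I-DISEASE
]
# Hypothetical per-token scores for a 4-token sentence.
EMISSIONS = [
    [3.0, 0.5, 0.1],
    [0.2, 1.0, 1.5],  # I-DISEASE scores highest locally, but O -> I is forbidden
    [0.1, 0.3, 2.0],
    [1.8, 0.2, 0.4],
]

decoded = [TAGS[i] for i in viterbi_decode(EMISSIONS, TRANSITIONS)]
print(decoded)
```

With these toy numbers, picking the top tag per token independently would produce the invalid sequence O, I-DISEASE, I-DISEASE, O, whereas the decoded path is O, B-DISEASE, I-DISEASE, O. That sequence-level consistency is the usual motivation for adding a CRF on top of a token classifier.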

Keywords: liver cancer electronic medical records; entity recognition; knowledge extraction; RoBERTa-CRF model; natural language processing

Classification code: R-058 [Medicine and Health]

 
