Exploration of medical record text error correction based on clinical medical record pre-training language model

Cited by: 1

Authors: NAI Cun-jian (奈存剑); YANG Liang (杨亮); CHEN Wen-chang (陈文昌); LI Lin-feng (李林峰); REN Yu-fei (任宇飞); WANG Huo-ming (汪火明); ZHANG Xiao-xiang (张晓祥) [1]

Affiliations: [1] Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030, Hubei Province, China; [2] Yidu Cloud (Beijing) Technology Co., Ltd., Beijing 100039, China

Source: Chinese Journal of Medical Library and Information Science, 2022, No. 10, pp. 27-32 (6 pages)

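Funding: National Health Commission standards formulation and revision project "Research on a collaborative development mechanism for clinical medical terminology standards" (2020090).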

Abstract: Misspelled words in electronic medical records (EMR) violate the national EMR management norms and reduce the effectiveness of natural language processing techniques, which in turn hinders value mining and application of EMR. This paper describes a method for automatic typo correction based on a language model pre-trained on a large corpus of real-world medical records. Experiments showed that the method performed well in both error detection and error correction on simulated and real-world medical record datasets. It also runs efficiently and can support EMR error correction both during and after record entry, which can effectively improve EMR quality and promote the application of EMR.

Keywords: electronic medical record; text error correction; deep learning; pre-trained language model

Classification: R197 [Medicine and Health: Health Service Management]; TP391.1 [Medicine and Health: Public Health and Preventive Medicine]

 
