Research on Implementing Dynamic Desensitization of Electronic Medical Documents via Stream Processing (电子医疗文档的流式动态脱敏实现研究)


Authors: HE Jianhu (何剑虎)[1]; YI Shengyue (伊胜月)[1]; SONG Liying (宋丽莹) (Women's Hospital, School of Medicine, Zhejiang University, Hangzhou 310006, Zhejiang Province, China)

Affiliation: [1] Women's Hospital, School of Medicine, Zhejiang University, Hangzhou 310006

Source: China Digital Medicine (《中国数字医学》), 2025, No. 4, pp. 61-67 (7 pages)

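Funding: Zhejiang Provincial Medical and Health Science and Technology Project: Research on a System for Automatic Discovery and Desensitization of Personal Sensitive Information in Electronic Medical Documents (2022PY062).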

Abstract: Objective: To desensitize electronic medical documents when they are shared, so as to protect patient privacy. Methods: A lexical analyzer for medical data that integrates multiple machine learning models was constructed, and corpora for Chinese word segmentation, part-of-speech tagging, and named entity recognition in the medical and health domain were compiled. Sensitive information in electronic medical documents was identified using natural language processing techniques such as hidden Markov models and conditional random fields together with a built-in sensitive information feature library, and dynamic desensitization was implemented through streaming processing of result sets. Results: The algorithm model performed well on routine personal sensitive information, and the average time to detect and desensitize personal sensitive information was on the order of milliseconds. Conclusion: Combining natural language processing with a sensitive information feature library enables the identification and real-time desensitization of sensitive information in unstructured electronic medical documents.
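The abstract does not give implementation details, so the following is only a minimal sketch of one part of the described approach: matching text against a sensitive information feature library and masking hits as records stream through. It does not reproduce the authors' system and omits the hidden Markov model and conditional random field components entirely. The two regular expressions, the masking rule, and the names `SIGNATURES`, `mask`, `desensitize_line`, and `desensitize_stream` are all hypothetical choices made for illustration.

```python
import re
from typing import Iterable, Iterator

# Hypothetical feature library: these patterns are illustrative assumptions,
# not the patterns used in the paper.
SIGNATURES = {
    # 18-character resident ID number (17 digits plus a digit or X)
    "ID_CARD": re.compile(r"(?<!\d)\d{17}[\dXx](?!\d)"),
    # 11-digit mainland mobile number
    "MOBILE": re.compile(r"(?<!\d)1[3-9]\d{9}(?!\d)"),
}


def mask(value: str, keep_head: int = 3, keep_tail: int = 2) -> str:
    """Replace the middle of a matched value with asterisks."""
    if len(value) <= keep_head + keep_tail:
        return "*" * len(value)
    hidden = len(value) - keep_head - keep_tail
    return value[:keep_head] + "*" * hidden + value[-keep_tail:]


def desensitize_line(line: str) -> str:
    """Mask every feature-library match found in a single line of text."""
    for pattern in SIGNATURES.values():
        line = pattern.sub(lambda m: mask(m.group()), line)
    return line


def desensitize_stream(lines: Iterable[str]) -> Iterator[str]:
    """Lazily yield masked lines so the full document is never held in memory."""
    for line in lines:
        yield desensitize_line(line)


if __name__ == "__main__":
    sample = ["Phone 13812345678, ID 110101199003074512.", "No identifiers here."]
    for masked in desensitize_stream(sample):
        print(masked)
```

Because `desensitize_stream` is a generator, each record is masked as it is read, which is one plausible way to apply masking at read time without storing a desensitized copy. Free-text entities such as names and addresses would still need a trained sequence-labelling model, which this sketch does not attempt.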

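Keywords: electronic medical documents; sensitive information identification; dynamic desensitization; natural language processing; stream processing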

Classification: R197.3 [Medicine and Health: Health Services Management]; R319 [Medicine and Health: Public Health and Preventive Medicine]

 
