Research on the negative and positive automatic labeling of doctor-patient dialogue entities based on BERT  Cited by: 2


Authors: Sun Yuanyuan; Shen Xifeng; Li Meiting; Nan Jiale; Zhang Weining; Gao Dongping[1]

Affiliations: [1] Institute of Information on Medicine, Peking Union Medical College, Chinese Academy of Medical Sciences, Beijing 100020, China; [2] Department of Internal Medicine, Peking Union Medical College Hospital, Peking Union Medical College, Chinese Academy of Medical Sciences

Source: China Digital Medicine, 2022, Issue 3, pp. 34-38 (5 pages)

Funding: Supported by the Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2020AAA0104905).

Abstract: Objective: To design a front-end task module for online consultation in intelligent health care, to study automatic labeling of real doctor-patient dialogue texts from the Internet, and to find a method that identifies the positive or negative status of dialogue entities with higher accuracy. Methods: The characteristics of real doctor-patient dialogue text were analyzed. BERT and BERT-WWM were each used to embed the entities in the dialogue text as vectors; semantic information was then extracted, and the entity attributes were classified and scored so that each entity's positive or negative status could be labeled automatically. Results: In the experiments, BERT-WWM outperformed BERT by approximately 16% on entity classification labeling of Chinese dialogues. Conclusion: Whole word masking, which masks by unit (Unit) rather than by single character, should be preferred when classifying and labeling medical entities; it can substantially improve on the accuracy of the original model.

Keywords: online consultation; entity labeling; BERT; BERT-WWM

Classification codes: R319 [Medicine and Health: Basic Medicine]; TP391.1 [Automation and Computer Technology: Computer Application Technology]

 
