Relation Extraction Method of Medical Texts Based on BERT-BILSTM (基于BERT-BILSTM的医疗文本关系提取方法)  Cited by: 4

Authors: GONG Ru-xin (龚汝鑫); YU Xiao-sheng (余肖生) [1]

Affiliation: [1] School of Computer and Information, Three Gorges University, Yichang 443002, Hubei, China

Source: Computer Technology and Development (《计算机技术与发展》), 2022, No. 4, pp. 186-192 (7 pages)

Funding: National Key Research and Development Program of China (2016YFC0802500).

Abstract: Relation extraction from health and medical texts makes fuller use of medical resources and lays the foundation for building hospital systems and related knowledge graphs. However, such texts are tightly linked by context and have a complex content structure. Traditional machine learning methods cannot fully learn and exploit the information they contain, and because those methods do not specifically handle medical domain terminology, important entities needed for the analysis are lost, which lowers accuracy. To address this, a relation extraction method for health and medical texts that combines BERT and BILSTM is proposed. Medical keywords are extracted in the preprocessing stage, the BERT language model produces the word embeddings, BILSTM and an attention mechanism process the features, and a Softmax classifier outputs class probabilities that determine the relation category between entities. In experiments on two clinical medical datasets, compared with unidirectional LSTM, CNN, BIGRU, and other models, the BERT-BILSTM-ATT model performs best: precision improves by more than 3.35%, recall by more than 1.28%, and F1 by more than 2.58%. The proposed method can accurately and effectively predict the relation categories between entities in health and medical texts.
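The last two stages named in the abstract, attention over the sequence features and a Softmax classifier, can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not give the attention formula, so the code assumes one common form (score each time step by the dot product of tanh(h) with a query vector, then take the weighted sum). The hidden states stand in for BILSTM outputs computed from BERT embeddings, which are not reproduced here. All numbers, the query vector, the classifier weights, and the labels in `RELATION_LABELS` are hypothetical toy values.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attention_pool(hidden_states, query):
    """Weight each time step by softmax(tanh(h) . query), then sum.

    Assumed formulation; the source abstract does not specify one.
    """
    scores = [dot([math.tanh(x) for x in h], query) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [
        sum(w * h[i] for w, h in zip(weights, hidden_states))
        for i in range(dim)
    ]
    return pooled, weights


def classify(pooled, class_weights, class_bias):
    """Linear layer followed by softmax over the relation classes."""
    logits = [dot(w, pooled) + b for w, b in zip(class_weights, class_bias)]
    return softmax(logits)


# Hypothetical stand-ins for BILSTM outputs: 3 time steps, 4 features each.
hidden_states = [
    [0.1, -0.2, 0.3, 0.0],
    [0.5, 0.4, -0.1, 0.2],
    [-0.3, 0.1, 0.2, 0.6],
]
# Hypothetical parameters that would normally be learned in training.
query = [0.2, -0.1, 0.4, 0.3]
class_weights = [
    [0.3, 0.1, -0.2, 0.5],
    [-0.4, 0.2, 0.1, 0.0],
    [0.1, -0.3, 0.6, -0.2],
]
class_bias = [0.0, 0.1, -0.1]
# Hypothetical relation labels; the paper's label set is not in the abstract.
RELATION_LABELS = ["treats", "causes", "no_relation"]

pooled, weights = attention_pool(hidden_states, query)
probs = classify(pooled, class_weights, class_bias)
best = max(range(len(probs)), key=probs.__getitem__)
predicted = RELATION_LABELS[best]

print("attention weights:", [round(w, 3) for w in weights])
print("class probabilities:", [round(p, 3) for p in probs])
print("predicted relation:", predicted)
```

The attention weights sum to 1 across time steps, so the pooled vector is a weighted average of the per-token features, and the classifier then picks the relation label with the highest probability.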

Keywords: relation extraction; bidirectional long short-term memory neural network; attention mechanism; health and medical texts; BERT

Classification code: TP391.1 [Automation and Computer Technology: Computer Application Technology]

 
