Medical Entity Relation Extraction Based on Deep Neural Network and Self-attention Mechanism  Cited by: 10


Authors: ZHANG Shi-hao (张世豪), DU Sheng-dong (杜圣东)[1], JIA Zhen (贾真)[1], LI Tian-rui (李天瑞)[1] (School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756, China)

Affiliation: [1] School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756, China

Source: Computer Science (《计算机科学》), 2021, No. 10, pp. 77-84 (8 pages)

Funding: Sichuan Key R&D Project (2020YFG0035).

Abstract: As medical informatization advances, the medical field has accumulated a large volume of unstructured text, and mining valuable information from it is an active research topic in both the medical industry and natural language processing. With the development of deep learning, deep neural networks have gradually been applied to relation extraction, and the “recurrent+CNN” network framework has become the mainstream model for medical entity relation extraction. However, medical text has a high density of entities, and the relations between those entities are interleaved and cross-connected, so the “recurrent+CNN” framework cannot deeply mine the semantic features of medical sentences. Building on the “recurrent+CNN” framework, this paper proposes a Chinese medical entity relation extraction model that incorporates a multi-channel self-attention mechanism: 1) a BLSTM captures the contextual information of each sentence; 2) a multi-channel self-attention mechanism mines the global semantic features of the sentence; 3) a CNN captures the local phrase features of the sentence. Experiments on a Chinese medical text dataset verify the effectiveness of the model: its precision, recall, and F1 score all improve over those of mainstream models.
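The three stages named in the abstract (BLSTM, multi-channel self-attention, CNN) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract gives no layer sizes, attention formulation, or classifier, so everything concrete below is an assumption: "multi-channel" is read as several independent scaled dot-product attention projections whose outputs are concatenated, the CNN is a single-width 1-D convolution with max-over-time pooling, the softmax output layer is added only to produce relation probabilities, and all weights are random and untrained. The function names (`blstm`, `multi_channel_self_attention`, `conv_max_pool`, `relation_probs`, `init_params`) are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_pass(x, W, U, b):
    """Run one LSTM direction over x of shape (T, d_in); return states (T, h)."""
    h_dim = U.shape[0]
    h, c = np.zeros(h_dim), np.zeros(h_dim)
    states = []
    for x_t in x:
        i, f, o, g = np.split(x_t @ W + h @ U + b, 4)  # four gate pre-activations
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        states.append(h)
    return np.stack(states)

def blstm(x, fwd, bwd):
    """Stage 1: context features from a forward and a backward pass, shape (T, 2h)."""
    h_f = lstm_pass(x, *fwd)
    h_b = lstm_pass(x[::-1], *bwd)[::-1]  # re-align backward states with the tokens
    return np.concatenate([h_f, h_b], axis=-1)

def multi_channel_self_attention(H, channels):
    """Stage 2 (assumed form): one scaled dot-product attention per channel."""
    outputs = []
    for Wq, Wk, Wv in channels:
        Q, K, V = H @ Wq, H @ Wk, H @ Wv
        A = softmax(Q @ K.T / np.sqrt(K.shape[-1]))  # (T, T) token-to-token weights
        outputs.append(A @ V)
    return np.concatenate(outputs, axis=-1)

def conv_max_pool(X, filters):
    """Stage 3: valid 1-D convolution over time, ReLU, then max-over-time pooling."""
    k = filters.shape[0]
    windows = np.stack([X[t:t + k] for t in range(X.shape[0] - k + 1)])
    feats = np.einsum("tkd,kdn->tn", windows, filters)
    return np.maximum(feats, 0.0).max(axis=0)

def relation_probs(x, p):
    H = blstm(x, p["fwd"], p["bwd"])
    G = multi_channel_self_attention(H, p["channels"])
    v = conv_max_pool(G, p["filters"])
    return softmax(v @ p["W_out"] + p["b_out"])  # assumed softmax classifier

def init_params(d_in=8, h=6, n_channels=3, d_att=4, k=3, n_filters=5, n_rel=4, seed=0):
    """Random, untrained weights; every size here is an illustrative assumption."""
    rng = np.random.default_rng(seed)
    r = lambda *shape: rng.normal(scale=0.1, size=shape)
    lstm = lambda: (r(d_in, 4 * h), r(h, 4 * h), np.zeros(4 * h))
    return {
        "fwd": lstm(),
        "bwd": lstm(),
        "channels": [(r(2 * h, d_att), r(2 * h, d_att), r(2 * h, d_att))
                     for _ in range(n_channels)],
        "filters": r(k, n_channels * d_att, n_filters),
        "W_out": r(n_filters, n_rel),
        "b_out": np.zeros(n_rel),
    }

params = init_params()
sentence = np.random.default_rng(1).normal(size=(10, 8))  # 10 tokens, 8-dim embeddings
probs = relation_probs(sentence, params)  # one probability per assumed relation class
```

The ordering mirrors the abstract: attention operates on the BLSTM states so that each token can weigh every other token in the sentence, and the convolution then extracts local phrase-level features from the attended representation.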

Keywords: medical text; entity relation extraction; multi-channel self-attention; deep learning

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
