Authors: FAN Zhiyuan (范智渊); HE Xuan (何璇) [1,2]; LIANG Pin (梁品); LU Jing (吕晶); KANG Yan (康雁) [3]
Affiliations: [1] College of Medicine and Biological Information Engineering, Northeastern University, Shenyang 110819, P.R. China; [2] Neusoft Research of Intelligent Healthcare Technology, Co. Ltd., Shenyang 110819, P.R. China; [3] College of Health Science and Environmental Engineering, Shenzhen Technology University, Shenzhen, Guangdong 518118, P.R. China
Source: Journal of Biomedical Engineering (《生物医学工程学杂志》), 2021, No. 3, pp. 563-573 (11 pages)
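Funding: National Natural Science Foundation of China, Youth Program (61806048); Open Project of Neusoft Research of Intelligent Healthcare Technology, Co. Ltd. (NRIHTOP1802).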
Abstract: The medical literature contains a wealth of valuable medical knowledge. Research on entity relation extraction from medical literature has made great progress, but as the volume of medical literature grows exponentially, annotating medical text has become a major problem. To address the long time and heavy workload of manual annotation, researchers have proposed distant supervision as a labeling method, but this method introduces a large amount of noise. This paper proposes a novel neural network structure based on the convolutional neural network that can handle this noise. The model uses a multi-window convolutional neural network to extract sentence features automatically. Once the sentence vectors are obtained, an attention mechanism selects the sentences that are informative for the true relation. In particular, an entity type (ET) embedding method is proposed, which adds entity type features for relation classification. To deal with the unavoidable labeling errors in the training text, a sentence-level attention mechanism is proposed for relation extraction. Experiments were conducted on 968 medical articles on diabetes. The results show that, compared with the baseline models, the proposed model performs well on medical literature, with an F1 score of 93.15%. Finally, the 11 types of extracted relations were stored as triples, and these triples were used to build a medical knowledge graph of complex relations with 33347 nodes and 43686 relation edges. The experimental results show that the algorithm used in this paper clearly outperforms the best baseline system for relation extraction.
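The abstract does not give the attention formula, so the following is only a minimal sketch of one common form of sentence-level attention over a bag of sentences that mention the same entity pair. It assumes each sentence has already been encoded as a vector (for example by the multi-window CNN) and that each relation has a learned query vector. The function names, the dot-product scoring, and the toy numbers are illustrative assumptions, not the authors' implementation.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_over_bag(
    sentence_vecs: Sequence[Sequence[float]],
    relation_query: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Weight each sentence by its match with a relation query vector.

    Assumption: the score is a plain dot product; the paper may use a
    different scoring function.
    """
    scores = [
        sum(x * q for x, q in zip(vec, relation_query))
        for vec in sentence_vecs
    ]
    weights = softmax(scores)
    dim = len(relation_query)
    # The bag representation is the attention-weighted sum of sentence vectors.
    bag_vec = [
        sum(w * vec[d] for w, vec in zip(weights, sentence_vecs))
        for d in range(dim)
    ]
    return weights, bag_vec


# Toy bag: two sentences that match the relation and one noisy sentence
# (hypothetical values standing in for encoder outputs).
bag = [
    [2.0, 0.0],
    [1.5, 0.5],
    [-1.0, 1.0],
]
query = [1.0, 0.0]
weights, bag_vec = attend_over_bag(bag, query)
print(weights)
print(bag_vec)
```

In this sketch the noisy third sentence receives a small weight, so it contributes little to the bag vector that would be passed to the relation classifier. This is the sense in which attention can soften the labeling errors introduced by distant supervision.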