Authors: Zhao Ruijie (赵蕊洁); Tong Xinyu (佟昕瑀); Liu Xiaohua (刘小桦); Lu Yonghe (路永和)[1] (School of Information Management, Sun Yat-Sen University, Guangzhou 510006, China)
Source: Data Analysis and Knowledge Discovery (《数据分析与知识发现》), 2022, No. 9, pp. 100-112 (13 pages)
Funding: One of the research outcomes of the Guangzhou Science and Technology Program project (Project No. 202002020036).
Abstract: [Objective] This paper proposes a new entity recognition model to improve pharmaceutical entity recognition, support the mining of new pharmaceutical knowledge, and increase the utilization of pharmaceutical scientific papers. [Methods] We constructed a pharmaceutical entity recognition model based on Attention-BiLSTM-CRF and tested it separately on two public datasets, GENIA Term Annotation Task and BioCreative II Gene Mention Tagging. We then used the model to annotate entities in the abstracts of biomedical papers. [Results] The proposed model outperforms the other baseline models, with F1 scores of 81.57% and 84.23% and accuracy of 92.51% and 97.85% on the two datasets, respectively, and its advantage is larger when the data are imbalanced. [Limitations] The entity annotation experiment is limited in both data volume and application scope. [Conclusions] The Attention-BiLSTM-CRF-based pharmaceutical entity recognition model improves entity recognition performance and supports the mining of new pharmaceutical knowledge.