Authors: CHEN Jing (陈晶), SUN Yaxuan (孙亚轩), XING Kexuan (邢珂萱) (School of Electronics and Information Engineering, Guangdong Ocean University, Zhanjiang 524088; School of Information Science and Engineering, Yanshan University, Qinhuangdao 066004; Key Laboratory of Virtual Technology and System Integration, Yanshan University, Qinhuangdao 066004)
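Affiliations: [1] School of Mathematics and Computer Science, Guangdong Ocean University, Zhanjiang 524088; [2] School of Information Science and Engineering, Yanshan University, Qinhuangdao 066004; [3] Hebei Key Laboratory of Virtual Technology and System Integration, Qinhuangdao 066004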
Source: Chinese High Technology Letters (《高技术通讯》), 2024, No. 10, pp. 1058-1069 (12 pages)
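Funding: Supported by the National Natural Science Foundation of China (62172352, 61871465, 42306218); the Central Government Guided Local Science and Technology Development Fund (226Z0102G, 226Z0305G); the Natural Science Foundation of Hebei Province (2022203028); and the Scientific Research Start-up Fund of Guangdong Ocean University (060302102304).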
Abstract: Medical texts contain a large number of specialized terms, so they are more prone than general-domain texts to word segmentation errors and out-of-vocabulary words. These errors cause a loss of contextual semantics and reduce the accuracy of named entity recognition (NER). To address these problems, this paper proposes WI-NER, a Chinese medical named entity recognition model based on the gated recurrent unit (GRU) that incorporates lexical information. First, based on the characteristics of Chinese medical datasets, the paper describes the task definition, entity position labels, and entity category labels for named entity recognition in the Chinese medical domain; in the embedding layer, the model performs feature embedding and vector fusion for characters that match specialized terms. Second, a lexical gating unit is added to the context encoding layer; it uses the memory and forgetting mechanisms of recurrent neural networks to automatically extract the features needed for entity recognition, and the introduction of lexical information and prior knowledge improves the recognition of Chinese medical named entities. Finally, the model is evaluated experimentally on three datasets. The results show that the proposed model outperforms the baseline models in accuracy and exhibits the characteristics expected for the medical domain.
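The abstract states that characters matching specialized terms receive lexical features in the embedding layer, but this record does not say how the matching is done. The sketch below illustrates one common approach from lexicon-enhanced Chinese NER and is an assumption, not the authors' method: for each character, collect the lexicon words in which it appears as the beginning (B), middle (M), or end (E), or which it forms alone (S). The function name `match_lexicon`, the `max_len` parameter, and the toy lexicon are hypothetical.

```python
from typing import Dict, List, Set


def match_lexicon(sentence: str, lexicon: Set[str],
                  max_len: int = 4) -> List[Dict[str, List[str]]]:
    """Return, for every character, the lexicon words that cover it,
    grouped by the character's position inside the word (B/M/E/S).

    Illustrative only: the B/M/E/S grouping is an assumed scheme,
    not taken from the WI-NER paper.
    """
    slots = [{"B": [], "M": [], "E": [], "S": []} for _ in sentence]
    n = len(sentence)
    for start in range(n):
        # Try every candidate span of up to max_len characters.
        for length in range(1, max_len + 1):
            end = start + length
            if end > n:
                break
            word = sentence[start:end]
            if word not in lexicon:
                continue
            if length == 1:
                slots[start]["S"].append(word)
            else:
                slots[start]["B"].append(word)
                for mid in range(start + 1, end - 1):
                    slots[mid]["M"].append(word)
                slots[end - 1]["E"].append(word)
    return slots


# Toy example: "急性胃炎" (acute gastritis) with a three-word lexicon.
toy_lexicon = {"急性", "胃炎", "急性胃炎"}
toy_slots = match_lexicon("急性胃炎", toy_lexicon)
for char, slot in zip("急性胃炎", toy_slots):
    print(char, slot)
```

In a full model, the word sets gathered for each character would typically be embedded, pooled, and fused with the character embedding before the context encoder; those steps are omitted here because the record gives no detail about them.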