Authors: GUI Xiang-quan (贵向泉) [1]; GUO Liang (郭亮); LI Li (李立) [1]
Affiliation: [1] School of Computer and Communication, Lanzhou University of Technology, Lanzhou 730050, Gansu, China
Source: Computer Technology and Development (《计算机技术与发展》), 2023, No. 10, pp. 93-100 (8 pages)
Funding: National Key Research and Development Program of China (2020YFB1713600).
Abstract: Named entities are an important basis for building industrial enterprise profiles and industrial knowledge graphs. Existing methods for named entity recognition in the nonferrous metallurgy domain fail to fully extract the semantic features of text, make little use of the prior knowledge carried by labels, and perform poorly on nested named entities. To address these problems, we propose MEAB (MRC-ERNIE-Attention-BiLSTM), a model structure based on the Machine Reading Comprehension (MRC) framework and the knowledge-enhanced semantic representation model ERNIE (Enhanced Representation through Knowledge Integration). Building on the MRC framework, the model introduces an Attention-based information fusion strategy: two differently structured kinds of data are passed through the ERNIE pre-trained model for feature extraction and converted into vectors, which are then fused in the information fusion layer so that the model can learn the prior knowledge in the labels. A BiLSTM then extracts features from the semantically informed vectors in both directions, and the result is output through a multi-layer nested named entity recognizer, which improves recognition accuracy for nested named entities. Experiments on the constructed named entity recognition dataset for the nonferrous metallurgy domain show that MEAB reaches a precision of 78.77%, a recall of 79.76%, and an F1 score of 79.26%, demonstrating the effectiveness of the model.
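The three figures reported in the abstract can be checked against one another, assuming the F1 value is the usual harmonic mean of precision and recall (the record itself does not state the formula). The sketch below is only a consistency check on the published numbers; the function name `f1_score` is an illustrative choice, not something taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed definition of F1).

    Inputs and output share the same unit (here, percentage points).
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Values as reported in the abstract, in percent.
reported_precision = 78.77
reported_recall = 79.76

f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 2))  # prints 79.26
```

Under that assumption the recomputed value agrees with the reported F1 of 79.26%.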
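Keywords: nonferrous metallurgy industry; natural language processing; named entity recognition; MRC; ERNIE
Classification: TP391 [Automation and Computer Technology: Computer Application Technology]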