A Method for Identifying Named Entities in Annotated Word and Sentence Sequences in a Knowledge Base

Authors: GUO Long (郭龙); LIANG Can (梁灿); LI Yan-li (李彦丽)

Affiliations: [1] Zhanjiang Branch of CNOOC Information Technology Co., Ltd., Haikou, Hainan 570100, China [2] School of Civil and Architectural Engineering, Changzhou Institute of Technology, Changzhou, Jiangsu 213032, China [3] State Key Laboratory of Oil and Gas Resources and Engineering, China University of Petroleum (Beijing), Beijing 102249, China [4] Hainan Branch of CNOOC (China) Co., Ltd., Haikou, Hainan 570100, China

Source: Computer Simulation (《计算机仿真》), 2024, No. 11, pp. 512-516 (5 pages)

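Funding: National Natural Science Foundation of China (42004105); Open Project of the State Key Laboratory of Oil and Gas Resources and Engineering (PRE/open-2304).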

Abstract: Network text data is diverse in information type, massive in scale, and variable in form. Traditional sequence-based named entity recognition methods struggle to mine knowledge-base text data accurately, tend to lose textual information, and recognize entity information poorly. To address these problems, a named entity recognition method for knowledge bases based on a graph neural network is proposed. The method represents text by fusing word-level and sentence-level information, so that word and sentence information is not lost during recognition. A forget gate with a sigmoid function then discards word and sentence information that is irrelevant or only weakly relevant and retains the strongly relevant information. The tanh function and the memory cell unit update the word and sentence information, a graph neural network mines the features of words and sentences and the associations between them, and a conditional random field with likelihood maximization labels the word and sentence sequence to determine the named entities. Finally, experiments are used to validate the proposed method. The results show that the method significantly improves named entity recognition accuracy, converges quickly, and performs well in application.
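The gating steps named in the abstract (a forget gate with a sigmoid function, then a tanh-based update of the memory cell) correspond to the standard long short-term memory (LSTM) cell listed in the keywords. The sketch below shows one step of that textbook formulation in plain Python. It is an illustration only: the paper's own equations, dimensions, and parameters are not given here, so the function names (`sigmoid`, `affine`, `lstm_step`), the `params` layout, and the gate labels `f`, `i`, `g`, `o` are all assumptions.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function, maps any real to (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def affine(weights, bias, vec):
    """Return weights @ vec + bias using plain Python lists."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def lstm_step(x, h_prev, c_prev, params):
    """One step of a standard LSTM cell (assumed formulation, not the paper's).

    x      : input vector for the current token
    h_prev : previous hidden state
    c_prev : previous memory cell state
    params : dict mapping 'f', 'i', 'g', 'o' to (W, b), where W has
             len(h_prev) rows and len(x) + len(h_prev) columns
    """
    z = list(x) + list(h_prev)  # concatenate input and previous hidden state

    # Forget gate: values near 0 discard old memory, values near 1 keep it.
    f = [sigmoid(a) for a in affine(*params["f"], z)]
    # Input gate: how much of the new candidate to write into memory.
    i = [sigmoid(a) for a in affine(*params["i"], z)]
    # Candidate memory content, squashed to (-1, 1) by tanh.
    g = [math.tanh(a) for a in affine(*params["g"], z)]
    # Output gate: how much of the memory to expose as hidden state.
    o = [sigmoid(a) for a in affine(*params["o"], z)]

    # Memory cell update: keep part of the old memory, add gated new content.
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    # Hidden state passed on to the next step (or to a downstream layer).
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c
```

With a strongly negative forget-gate bias the gate output approaches 0 and the previous cell state is effectively erased, which is the "discard weakly relevant information" behaviour the abstract describes. In a full pipeline the resulting hidden states would feed the graph neural network and the conditional random field layers, which this sketch does not cover.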

Keywords: graph neural network; knowledge base; character and word vectors; named entity; long short-term memory network

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]
