Named Entity Recognition Algorithm Based on Semantic Relevance (基于语义相关性的命名实体识别算法研究)  Cited by: 1


Authors: YUAN Yunxin (袁运新); FAN Tengfei (樊腾飞); NIE Weizhi (聂为之)

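Affiliations: [1] School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China [2] Tianjin International Engineering Institute, Tianjin University, Tianjin 300072, China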

Source: Journal of Chinese Information Processing (《中文信息学报》), 2023, Issue 10, pp. 34-44 (11 pages)

Funding: National Key Research and Development Program of China (2020YFB1711704); National Natural Science Foundation of China (62272337).

Abstract: High-quality named entity recognition (NER) algorithms tend to rely on large amounts of high-quality annotated data for training, but large-scale annotated data is difficult to obtain. How to improve recognition accuracy by exploiting the relevance within the text itself has therefore attracted growing attention from researchers. This paper uses the semantic relevance of text to introduce the concept of an "entity combiner", whose high relevance to entities improves the numerical representation of entities and thereby supports their recognition. First, an entity combiner recognition model is proposed that uses the associative structure of the text to identify entity combiners in unlabeled text. Next, a classical BiLSTM (Bi-directional Long Short-Term Memory) network extracts the semantic representation of each sentence, and a feature fusion mechanism combines the entity combiner features with the sentence features. Because entity combiners are strongly correlated with entities, a constraint mechanism on both the entity representation and the whole-sentence representation is also proposed, so that the entity combiner guides feature learning and entities in the text are recognized accurately and efficiently. Experiments on the public CoNLL03 and NCBI Disease datasets demonstrate the superiority and soundness of the proposed algorithm.
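The abstract names a feature fusion mechanism between the entity combiner and the sentence features but does not specify its form. The following is only a minimal sketch of one common choice, a gated sum, and is not the authors' method. The names `gate` and `fuse` are hypothetical, the scaled dot-product gate is an assumption, and the toy vectors stand in for BiLSTM outputs.

```python
import math
from typing import List

Vector = List[float]


def gate(h: Vector, c: Vector) -> float:
    """Relevance gate in (0, 1): sigmoid of the scaled dot product of h and c."""
    score = sum(x * y for x, y in zip(h, c)) / math.sqrt(len(h))
    return 1.0 / (1.0 + math.exp(-score))


def fuse(token_feats: List[Vector], combiner_feat: Vector) -> List[Vector]:
    """Mix each token vector with the combiner vector, weighted by the gate.

    Assumed form (not taken from the paper): fused = (1 - g) * h + g * c.
    """
    fused = []
    for h in token_feats:
        g = gate(h, combiner_feat)
        fused.append([(1.0 - g) * x + g * y for x, y in zip(h, combiner_feat)])
    return fused


# Toy 2-dimensional features standing in for BiLSTM outputs (made-up numbers).
sentence = [[0.0, 0.0], [1.0, 1.0], [1.0, -1.0]]
combiner = [1.0, 1.0]
for vec in fuse(sentence, combiner):
    print([round(v, 3) for v in vec])
```

In this sketch, a token whose feature vector points in the same direction as the combiner vector receives a larger gate value and so absorbs more of the combiner's information. A trained model would learn the gate parameters rather than use a fixed dot product.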

Keywords: named entity recognition; semantic relevance; entity combiner

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
