Chinese Entity and Relation Extraction Model Based on Parallel Heterogeneous Graph and Sequence Attention Mechanisms

Authors: MAO Dianhui (毛典辉) [1,2]; LI Xuebo (李学博); LIU Junling (刘峻岭); ZHANG Denghui (张登辉); YAN Wenjing (颜文婧)

Affiliations: [1] Beijing Key Laboratory of Big Data Technology for Food Safety (Beijing Technology and Business University), Beijing 100048, China; [2] National Engineering Laboratory for Agri-Product Quality Traceability (Beijing Technology and Business University), Beijing 100048, China

Source: Journal of Computer Applications (《计算机应用》), 2024, No. 7, pp. 2018-2025 (8 pages)

Funding: Beijing Natural Science Foundation (9232005); Support Program for Faculty Development at Beijing Municipal Universities (BPHR20220104).

Abstract: In recent years, with the rapid development of deep learning, entity and relation extraction has made remarkable progress in many fields. However, because Chinese has complex syntactic structures and semantic relationships, Chinese entity and relation extraction still faces many challenges, and overlapping triples in Chinese text are among the most important. To address this problem, a Hybrid Neural Network Entity and Relation Joint Extraction (HNNERJE) model is proposed. HNNERJE fuses a sequence attention mechanism and a heterogeneous graph attention mechanism in parallel and combines them with a gated fusion strategy to build a deeply integrated framework. The model can therefore capture both word-order information and entity-association information in Chinese text, and it adaptively adjusts the outputs of the subject and object taggers, which effectively addresses the overlapping-triple problem. In addition, an adversarial training algorithm is introduced to improve the model's adaptability to unseen samples and noise. Finally, the SHapley Additive exPlanations (SHAP) method is used to interpret HNNERJE, revealing the key features the model relies on when extracting entities and relations. HNNERJE achieves F1 scores of 92.17%, 93.42%, 47.40%, and 67.98% on the NYT, WebNLG, CMeIE, and DuIE datasets, respectively. The experimental results indicate that HNNERJE can transform unstructured text into structured knowledge representations and effectively extract the valuable information it contains.
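The gated fusion of the two parallel branches mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the gate's formula, so everything below is an assumption: the function name `gated_fusion`, the element-wise (diagonal) gate weights `w_seq`, `w_graph`, and `bias`, and the convex-combination form are hypothetical and are not taken from the paper.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(
    seq_vec: List[float],
    graph_vec: List[float],
    w_seq: List[float],
    w_graph: List[float],
    bias: List[float],
) -> List[float]:
    """Fuse a sequence-attention feature vector with a graph-attention one.

    Assumed form (not confirmed by the source), applied per dimension:
        g     = sigmoid(w_seq * s + w_graph * h + bias)
        fused = g * s + (1 - g) * h
    A gate near 1 keeps the sequence feature; a gate near 0 keeps the
    graph feature.
    """
    fused = []
    for s, h, ws, wg, b in zip(seq_vec, graph_vec, w_seq, w_graph, bias):
        g = sigmoid(ws * s + wg * h + b)
        fused.append(g * s + (1.0 - g) * h)
    return fused


if __name__ == "__main__":
    # With all-zero gate parameters the gate is 0.5, so the output is the
    # plain average of the two branches.
    print(gated_fusion([1.0, 2.0], [3.0, 4.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]))
```

In a trained model the gate parameters would be learned, typically as full weight matrices instead of the per-dimension scalars used here to keep the sketch dependency-free.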

Keywords: entity and relation extraction; heterogeneous graph; attention mechanism; adversarial training; SHAP method

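Classification: TP391.1 [Automation and Computer Technology: Computer Application Technology]; R5 [Automation and Computer Technology: Computer Science and Technology]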
