Submodular Optimization Approach for Entity Summarization in Knowledge Graphs Driven by Large Language Models

Authors: ZHANG Qi (张琪); ZHONG Hao (钟昊)

Affiliations: [1] School of Information Technology and Engineering, Guangzhou College of Commerce, Guangzhou 511363, China; [2] School of Computer Science, South China Normal University, Guangzhou 510631, China

Source: Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), 2024, No. 7, pp. 1806-1813 (8 pages)

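Funding: National Key Research and Development Program of China (2023YFC3341200); National Natural Science Foundation of China (62377015); South China Normal University Young Teachers' Research Cultivation Fund (23KJ29).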

Abstract: The continuous expansion of knowledge graphs has made entity summarization a research hotspot. The goal of entity summarization is to obtain a concise description of an entity from the large set of triple-structured facts that describe it. This work proposes a submodular optimization method for entity summarization based on a large language model. First, the descriptive information of the entities, relations, and properties in the triples is embedded with a large language model, which captures the semantics of the triples and yields semantically rich embedding vectors. Second, based on these embedding vectors, a measure of relevance is defined between any two triples that describe the same entity; the higher the relevance between two triples, the more similar the information they contain. Finally, based on this relevance measure, a normalized and monotonically non-decreasing submodular function is defined, so that entity summarization is modeled as a submodular function maximization problem and a greedy algorithm with a performance guarantee can be applied directly to extract entity summaries. The method is tested on three public benchmark datasets, and the quality of the extracted summaries is evaluated with two metrics, F1 score and normalized discounted cumulative gain (NDCG). Experimental results show that the proposed approach significantly outperforms current state-of-the-art methods.
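The pipeline described in the abstract (embed triples, score pairwise relevance, greedily maximize a submodular objective) can be illustrated with a minimal sketch. The abstract does not give the exact relevance measure or submodular function, so everything concrete here is an assumption made for illustration: cosine similarity clipped at zero stands in for the relevance measure, a facility-location function stands in for the objective (it is normalized, monotone, and submodular when similarities are non-negative), and the toy vectors stand in for LLM embeddings. All function names are hypothetical. For objectives of this kind under a size limit k, the classical greedy algorithm achieves at least a (1 - 1/e) fraction of the optimal value.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def relevance_matrix(embeddings):
    """Pairwise relevance (assumed: cosine similarity clipped to be >= 0)."""
    n = len(embeddings)
    return [[max(0.0, cosine(embeddings[i], embeddings[j])) for j in range(n)]
            for i in range(n)]

def coverage(sim, selected):
    """Facility-location value: how well `selected` represents every triple."""
    if not selected:
        return 0.0  # normalized: f(empty set) = 0
    return sum(max(row[j] for j in selected) for row in sim)

def greedy_summary(sim, k):
    """Pick up to k triple indices, each time adding the largest marginal gain."""
    selected = []
    for _ in range(min(k, len(sim))):
        current = coverage(sim, selected)
        best, best_gain = None, -1.0
        for j in range(len(sim)):
            if j in selected:
                continue
            gain = coverage(sim, selected + [j]) - current
            if gain > best_gain:
                best, best_gain = j, gain
        selected.append(best)
    return selected

# Toy stand-ins for LLM embeddings of four triples (two near-duplicate pairs).
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
sim = relevance_matrix(toy_embeddings)
print(greedy_summary(sim, k=2))  # one index from each near-duplicate pair
```

Because the second pick adds little value if it duplicates the first, the greedy step naturally favors a triple from the other group, which is the behavior a summary that avoids redundant facts needs. In a real setting the toy vectors would be replaced by embeddings of the textual descriptions of each triple.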

Keywords: entity summarization; large language model; submodular function; greedy algorithm

Classification number: TP301 [Automation and Computer Technology: Computer System Architecture]
