Name disambiguation method based on hyperbolic space feature fusion


Authors: WU Nannan; GUO Zehao; ZHAO Yiming; ZHEN Zixu; WANG Wenjun [1]; LIU Yan

Affiliations: [1] College of Intelligence and Computing, Tianjin University, Tianjin 300354, China; [2] School of Computer Science and Technology, Anhui University, Hefei 230039, China

Source: CAAI Transactions on Intelligent Systems (《智能系统学报》), 2024, No. 1, pp. 79-88 (10 pages)

Funding: Key Research, Development and Transformation Program of Qinghai Province (2022-QY-218).

Abstract: Research such as user influence analysis is increasingly hampered by duplicate names, and the impact of name ambiguity keeps growing. To address this, the paper fuses hyperbolic-space and Euclidean-space features and proposes a multi-space network alignment method, geometry interaction network alignment (GINA), which models the central role that network structure plays in name disambiguation. GINA learns network representations in Euclidean space and hyperbolic space at the same time, so as to capture structural information with different geometric characteristics. Cross-space network mapping and cross-space feature fusion then let the two spaces exchange information while keeping the loss from spatial mapping as small as possible. The resulting final network representation is used for network alignment, which in turn yields name disambiguation. From a Chinese paper co-authorship network, an English paper co-authorship network, and a Chinese patent co-authorship network, aligned in pairs, the authors construct a "Paper-Patent" empirical network alignment dataset and a "Chinese-English" empirical network alignment dataset. GINA is evaluated on these datasets in two empirical scenarios: identifying individuals who share the same name, and matching author identities across Chinese and foreign papers. Combining hyperbolic space with Euclidean space gives a precision at least 24.9% higher than using a single space, which supports the effectiveness of the GINA method.
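The abstract names three steps: mapping between spaces, fusing features, and aligning networks. It does not give their formulas, so the following is only a minimal sketch under stated assumptions, not the GINA implementation. It assumes a Poincaré ball model with curvature parameter `c`, uses the standard exponential and logarithmic maps at the origin as the cross-space mapping, and substitutes a simple weighted average for the fusion step and cosine nearest-neighbour matching for the alignment step. All function names (`exp_map_origin`, `log_map_origin`, `fuse`, `align`) are hypothetical.

```python
import math


def norm(v):
    """Euclidean length of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def exp_map_origin(v, c=1.0):
    """Map a Euclidean (tangent) vector onto the Poincare ball of curvature -c.

    Assumption: the Poincare ball is the hyperbolic model; the abstract
    does not say which model GINA uses.
    """
    n = norm(v)
    if n == 0.0:
        return list(v)
    scale = math.tanh(math.sqrt(c) * n) / (math.sqrt(c) * n)
    return [scale * x for x in v]


def log_map_origin(y, c=1.0):
    """Inverse of exp_map_origin: map a ball point back to the tangent space at 0."""
    n = norm(y)
    if n == 0.0:
        return list(y)
    scale = math.atanh(math.sqrt(c) * n) / (math.sqrt(c) * n)
    return [scale * x for x in y]


def fuse(euclid_vec, hyper_vec, weight=0.5, c=1.0):
    """Hypothetical fusion: pull the hyperbolic vector into the tangent
    space, then take a weighted average with the Euclidean vector."""
    tangent = log_map_origin(hyper_vec, c)
    return [weight * e + (1.0 - weight) * t for e, t in zip(euclid_vec, tangent)]


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is zero."""
    na, nb = norm(a), norm(b)
    if na == 0.0 or nb == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)


def align(source, target):
    """For each source node, pick the target node with the most similar
    fused embedding (a stand-in for anchor link prediction)."""
    return {
        s: max(target, key=lambda t: cosine(s_vec, target[t]))
        for s, s_vec in source.items()
    }


if __name__ == "__main__":
    # Toy fused embeddings for two small networks (made-up values).
    papers = {"author_a": [1.0, 0.0], "author_b": [0.0, 1.0]}
    patents = {"inventor_x": [0.9, 0.1], "inventor_y": [0.1, 0.9]}
    print(align(papers, patents))
```

In this sketch the round trip `log_map_origin(exp_map_origin(v))` recovers `v` up to floating-point error, which is one simple way to check that a cross-space mapping loses little information.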

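Keywords: name disambiguation; Euclidean space; hyperbolic space; network alignment; network representation learning; graph embedding; feature fusion; anchor link prediction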

Classification number: TP39 [Automation and Computer Technology: Computer Application Technology]

 
