An Efficient High-Throughput Hypergraph Neural Network Acceleration System for CPU-GPU Heterogeneous Environments


Authors: Hui YU [1,2,3,4], Yu ZHANG, Xintao LI [1,2,3,4], Zikang CHEN, Yingqi ZHAO [1,2,3,4], Jin ZHAO, Hao QI, Xiaofei LIAO, Hai JIN

Affiliations: [1] National Engineering Research Center for Big Data Technology and System, Huazhong University of Science and Technology, Wuhan 430074, China; [2] Service Computing Technology and System Lab, Huazhong University of Science and Technology, Wuhan 430074, China; [3] Cluster and Grid Computing Lab, Huazhong University of Science and Technology, Wuhan 430074, China; [4] School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China

Source: Scientia Sinica Informationis, 2025, Issue 4, pp. 841-859 (19 pages)

Funding: Hubei Provincial Key Research and Development Program (Grant No. 2023BAB078); National Key Research and Development Program of China (Grant No. 2024YFB4504200); National Natural Science Foundation of China (Grant No. 62472183); CCF-Ant Research Fund (Grant No. CCF-AFSG RF20240204)

Abstract: In recent years, graph neural networks (GNNs) have gained significant attention for their exceptional ability to learn and reason over non-Euclidean data. However, real-world graph structures often involve complex higher-order relationships between vertices, which are typically represented as hypergraphs. To effectively capture these higher-order features and the underlying semantic information in hypergraphs, numerous hypergraph neural network (HGNN) models have been proposed. Although several software systems already leverage the high parallel computing power of GPUs to accelerate HGNN training, they still suffer from excessive redundant computation and frequent data communication, leading to low GPU utilization. Moreover, these systems can only train HGNN models on relatively small graphs. To address these issues, we observe that, because multiple hyperedges in a hypergraph exhibit strong topological overlap, the computations of these hyperedges repeatedly fetch and process the same vertex feature vectors during HGNN training. Based on this observation, we propose RHGNN (redundancy-elimination hypergraph neural network), a high-performance HGNN training system tailored for CPU-GPU heterogeneous environments. RHGNN exploits the overlap between hyperedges to guide the computation and updating of hyperedge and vertex features, significantly reducing CPU-GPU data communication. Specifically, RHGNN introduces a novel hyperedge-centric redundancy-elimination execution method that loads vertex feature vectors in a communication-optimal way, allowing a single load to serve multiple hyperedge and vertex feature computations. In addition, RHGNN employs an efficient hyperdegree-aware hierarchical caching mechanism that preferentially caches, on the GPU, frequently accessed vertex and hyperedge feature vectors as well as the intermediate results of vertices shared across hyperedges, further reducing communication overhead. To validate the effectiveness of RHGNN, we compared its performance against the state-of-the-art HGNN software systems DGL, PyG, and HyperGef. Experimental results show that, for HGNN model training, RHGNN outperforms PyG, DGL, and HyperGef by 2.5-3.4x, 2.1-3.1x, and 1.4-2.3x, respectively.
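For context on where the redundancy arises, recall the hypergraph convolution layer widely used in HGNN models (Feng et al., AAAI 2019; the abstract itself does not fix a particular model), which first gathers vertex features into hyperedge features and then scatters them back to vertices:

```latex
% One hypergraph convolution layer (Feng et al., AAAI 2019).
% H: |V| x |E| incidence matrix; W: diagonal hyperedge-weight matrix;
% D_v, D_e: vertex and hyperedge degree matrices; \Theta: learnable weights.
X^{(l+1)} = \sigma\left( D_v^{-1/2}\, H\, W\, D_e^{-1}\, H^{\top}\, D_v^{-1/2}\, X^{(l)}\, \Theta^{(l)} \right)
```

The gather step H^T X^(l) is exactly where hyperedges with overlapping vertex sets re-read the same rows of X^(l), which is the redundancy the abstract describes.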
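The abstract gives no code, so the following PyTorch-style sketch only illustrates the hyperedge-centric idea under assumed data layouts (the function and variable names are hypothetical, not RHGNN's API): each vertex feature shared by overlapping hyperedges crosses the CPU-GPU link once and is then reused by every hyperedge that contains it.

```python
import torch

def aggregate_hyperedges(features_cpu, hyperedges, device="cuda"):
    # Hyperedge-centric redundancy elimination (illustrative sketch):
    # vertices shared by overlapping hyperedges cross the CPU-GPU link
    # once and are reused by every hyperedge aggregation that needs them.
    #   features_cpu : (num_vertices, dim) feature matrix in host memory
    #   hyperedges   : list of 1-D LongTensors of vertex ids, one per hyperedge

    # Union of vertices touched by this batch: duplicates collapse, so each
    # shared vertex feature is transferred exactly once.
    needed = torch.unique(torch.cat(hyperedges))
    feats_gpu = features_cpu[needed].to(device)  # one bulk transfer

    # Map global vertex ids to row positions inside the transferred block.
    remap = torch.full((features_cpu.size(0),), -1, dtype=torch.long)
    remap[needed] = torch.arange(needed.numel())

    # One load now serves every hyperedge (mean is a placeholder aggregator).
    return [feats_gpu[remap[e].to(device)].mean(dim=0) for e in hyperedges]

# Tiny demo: two hyperedges overlapping on vertices {1, 2}.
feats = torch.randn(5, 8)
edges = [torch.tensor([0, 1, 2]), torch.tensor([1, 2, 3])]
out = aggregate_hyperedges(feats, edges, device="cpu")  # "cuda" on a GPU host
```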
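The hyperdegree-aware cache can be sketched in the same hypothetical style: rows for the vertices incident to the most hyperedges stay resident on the GPU, and only cache misses pay a host-to-device transfer. Again, this is an assumption-laden illustration rather than the paper's implementation.

```python
import torch

def build_cache(features_cpu, incidence, capacity, device="cuda"):
    # Hyperdegree-aware caching (illustrative sketch): pin on the GPU the
    # `capacity` vertices that appear in the most hyperedges, since those
    # rows are re-read most often during training.
    #   incidence : (2, nnz) LongTensor of (vertex_id, hyperedge_id) pairs
    hyperdegree = torch.bincount(incidence[0], minlength=features_cpu.size(0))
    hot = torch.topk(hyperdegree, capacity).indices
    cache = features_cpu[hot].to(device)  # resident GPU copy of hot rows

    # slot[v] >= 0 means vertex v's feature lives at cache[slot[v]].
    slot = torch.full((features_cpu.size(0),), -1, dtype=torch.long)
    slot[hot] = torch.arange(capacity)
    return cache, slot

def fetch(features_cpu, cache, slot, vertex_ids, device="cuda"):
    # Serve hits from GPU memory; only misses pay a host-to-device transfer.
    s = slot[vertex_ids]  # CPU-side lookup
    hit = s >= 0
    out = torch.empty(vertex_ids.numel(), features_cpu.size(1), device=device)
    out[hit.to(device)] = cache[s[hit].to(device)]
    out[(~hit).to(device)] = features_cpu[vertex_ids[~hit]].to(device)
    return out
```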

Keywords: hypergraph neural network; training acceleration; CPU-GPU heterogeneity; redundancy elimination; topological similarity

Classification: TP183 [Automation and Computer Technology - Control Theory and Control Engineering]

 
