Authors: YANG Zhoufan (杨周凡); HAN Lin (韩林); LI Bingyang (李冰洋); XIE Jingming (谢景明); HAN Pu (韩璞); LIU Yongjie (刘勇杰) (Information Engineering Institute, Zhengzhou University, Zhengzhou 450000, China; National Supercomputing Zhengzhou Center, Zhengzhou University, Zhengzhou 450000, China)
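Affiliations: [1] Information Engineering Institute, Zhengzhou University, Zhengzhou 450000, China; [2] National Supercomputing Zhengzhou Center, Zhengzhou University, Zhengzhou 450000, China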
Source: Computer Engineering (《计算机工程》), 2022, No. 9, pp. 155-161 (7 pages)
Funding: Major Science and Technology Special Project of Henan Province (201400210800, 201400210700).
Abstract: Water supply network simulation is widely used for urban water transmission and distribution scheduling, and it is an important technical means of monitoring and maintaining urban water supply networks. City-scale pipe networks generate massive volumes of computational data, so general-purpose computing platforms cannot meet the computing power that network simulation requires. To improve the computational efficiency of city-scale water supply network simulation, this paper proposes an effective parallelization scheme. The "Songshan" supercomputer system uses a Central Processing Unit + Data Cache Unit (CPU+DCU) architecture, and the scheme exploits that architecture's strength in dense data computation to run the water supply network simulation on the "Songshan" supercomputer. Following the Heterogeneous-computing Interface for Portability (HIP) heterogeneous programming model, heterogeneous computation of the water supply network simulation is implemented on the "Songshan" supercomputer. Combined with a pipeline data partitioning scheme, the Message Passing Interface (MPI) is used to launch multiple processes so that data communication is carried out with DCU acceleration. The problem of transmitting structures during computation is solved by redefining the data types, which enables large-scale dense computation on multiple DCUs within a single node. Comparisons across different computing platforms and multiple computing strategies show that, relative to a traditional x86 platform, the optimized scheme achieves speedups of 5.269 on small-scale data and 10.760 on large-scale data. Compared with a traditional GPU heterogeneous platform that uses the Compute Unified Device Architecture (CUDA) heterogeneous programming model, computing performance is also clearly improved.
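Keywords: Central Processing Unit + Data Cache Unit (CPU+DCU) architecture; Data Cache Unit (DCU) accelerator; simulation computing; Heterogeneous-computing Interface for Portability (HIP); Message Passing Interface (MPI)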
Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]