Authors: MAO Run-ze (毛润泽); WU Zi-heng (吴子恒); XU Jia-yang (徐嘉阳); ZHANG Yan (章严); CHEN Zhi (陈帜)
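Affiliations: [1] College of Engineering, Peking University, Beijing 100871, China [2] AI for Science Institute, Beijing 100083, China [3] CAEP Software Center for High Performance Numerical Simulation, Beijing 100088, China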
Source: Computer Engineering & Science (《计算机工程与科学》), 2024, No. 11, pp. 1901-1907 (7 pages)
Funding: National Natural Science Foundation of China (52276096, 92270203, 523B2062); Guanghe Fund C (光合基金C) (202302032372).
Abstract: In recent years, deep learning has been widely recognized as a reliable approach to accelerating reacting flow simulations. An open-source platform named DeepFlame was recently developed to support machine learning libraries and algorithms during the simulation of reacting flows. Building on DeepFlame, deep neural networks (DNNs) have been successfully employed to compute chemical reaction source terms, and this paper focuses on optimizing the platform for high performance. First, to fully harness the acceleration potential of DNNs, support for multi-GPU parallel DNN inference is implemented in DeepFlame, with an intra-node partitioning algorithm and a master-slave communication structure, and DeepFlame is ported to graphics processing units (GPUs) and deep computing units (DCUs). Second, the solution of partial differential equations and the construction of the discretized sparse matrices are implemented on GPUs based on the Nvidia AmgX library. Finally, the computational performance of the new version of DeepFlame is evaluated on CPU-GPU/DCU heterogeneous architectures. The results show that, using only a single GPU card, a maximum speedup of 15 is achieved when simulating a reactive Taylor-Green vortex (TGV).
Keywords: computational fluid dynamics; reacting flow; deep neural network; GPU; partial differential equations
Classification: TP391.9 [Automation and Computer Technology: Computer Application Technology]