Authors: ZHU Wen-long (朱文龙); JIANG Jia-zhi (江嘉治); HUANG Dan (黄聃); XIAO Nong (肖侬) (School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China)
Source: Computer Engineering & Science (《计算机工程与科学》), 2023, No. 9, pp. 1521-1531 (11 pages)
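Funding: National Key R&D Program of China (2021YFB0301300); National Natural Science Foundation of China (U1811461); Guangdong Basic and Applied Basic Research Foundation (2019B030302002); Program for Guangdong Introducing Innovative and Entrepreneurial Teams (2016ZT06D211); Key-Area Research and Development Program of Guangdong Province (2021B0101190003); Zhejiang Lab project (2021KC0AB04).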
Abstract: With the growing demand for computing power, a variety of domestically produced heterogeneous computing devices have emerged. Each device has its own dedicated programming model, and developers must write code in that model according to the device's architectural characteristics, so the resulting code is not portable across devices. In recent years, unified heterogeneous parallel programming models that support multiple computing devices have appeared abroad, but there is still relatively little research on, or implementation of, heterogeneous programming models for domestically produced devices. To address this problem, a performance-portable heterogeneous programming model called ParM has been developed. The programming model is provided as a C++ library and hides a large amount of low-level implementation detail, reducing the difficulty of parallel programming. The framework currently supports x86 CPUs, NVIDIA GPUs, Huawei Kunpeng processors, and Huawei Ascend AI processors as backend devices, with performance optimizations for each backend. Performance tests on these devices show that ParM achieves more than 90% of the performance of the native code.
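Keywords: performance portability; parallel programming model; high-performance computing; heterogeneous computing; domestic processors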
Classification: TP302.1 [Automation and Computer Technology: Computer System Architecture]