Authors: LUO Haiwen (罗海文); WU Yangjun (吴扬俊); SHANG Honghui (商红慧) (State Key Laboratory of Processors, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China)
Affiliation: [1] State Key Laboratory of Processors, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190
Source: Computer Science (《计算机科学》), 2023, No. 6, pp. 1-9 (9 pages)
Funding: National Key Research and Development Program of China (2020YFB1709500); National Natural Science Foundation of China (22003073).
Abstract: Density-functional perturbation theory (DFPT), based on quantum mechanics, can be used to calculate a variety of physicochemical properties of molecules and materials and is now widely used in research on new materials. Meanwhile, heterogeneous many-core processor architectures are becoming the mainstream of supercomputing. Redesigning and optimizing DFPT programs for heterogeneous many-core processors to improve their computational efficiency is therefore of great importance for the computation of physicochemical properties and their scientific applications. In this work, the computation of the first-order response density and the first-order response Hamiltonian matrix in DFPT is optimized for the many-core processor architecture and verified on the new-generation Sunway processors. The optimization techniques include loop tiling, discrete memory access processing, and collaborative reduction. Loop tiling divides the work into tasks that the many cores execute in parallel; discrete memory access processing converts discrete memory accesses into more efficient contiguous ones; collaborative reduction resolves write conflicts. Experimental results show that, on one core group, the optimized program performs 8.2 to 74.4 times better than the unoptimized program and exhibits good strong and weak scalability.
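Keywords: density-functional perturbation theory; first-principles calculation; high-performance computing; new-generation Sunway heterogeneous many-core processor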
Classification number: TP391 [Automation and Computer Technology - Computer Application Technology]