Authors: Yang Meifang (杨梅芳); Che Yonggang (车永刚) [1,2]; Gao Xiang (高翔)
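Affiliations: [1] College of Computer, National University of Defense Technology, Changsha 410073; [2] Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Changsha 410073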
Source: Journal of Computer Research and Development (《计算机研究与发展》), 2018, No. 2, pp. 400-408 (9 pages)
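Funding: National Natural Science Foundation of China, International Cooperation and Exchange Program (61561146395); National Natural Science Foundation of China (11502296); National High Technology Research and Development Program of China ("863" Program) (2012AA01A301)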
Abstract: LESAP is a scramjet combustion simulation application that models the chemical reactions and supersonic flow inside the engine combustor. It is used to solve practical engineering problems and is computationally very expensive. In this paper, we port LESAP to a heterogeneous many-core platform composed of general-purpose CPUs and Intel Many Integrated Core (MIC) coprocessors, using the OpenMP 4.0 accelerator model. Based on the application's characteristics, we apply a series of optimizations: OpenMP 4.0 based task offloading, data movement optimization, load balancing based on grid-block partitioning, and SIMD vectorization. Performance is evaluated on one Tianhe-2 supercomputer node (two 12-core Intel Xeon E5-2692 CPUs plus three Intel Xeon Phi 31S1P coprocessors) for a real scramjet combustion simulation with 5 320 896 grid cells. The heterogeneous code reduces the average runtime per time step from 64.72 s for the original CPU-only code to 21.06 s, a speedup of about 3.07.
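The abstract names load balancing based on grid-block partitioning but gives no algorithm. As an illustration only, the sketch below shows one common way such a partition could be computed: a greedy assignment that gives each block to the device with the smallest estimated finish time. The function name `partition_blocks`, the block sizes, and the relative device speeds are all hypothetical and do not come from the paper. The last line recomputes the speedup from the two runtimes that the abstract does report.

```python
def partition_blocks(block_sizes, device_speeds):
    """Greedily assign grid blocks to devices (hypothetical helper, not from the paper).

    block_sizes   -- number of cells in each grid block
    device_speeds -- relative throughput of each device (cells per unit time)
    Returns (assignment, loads): block indices per device and cells per device.
    """
    loads = [0] * len(device_speeds)
    assignment = [[] for _ in device_speeds]
    # Place the largest blocks first so small blocks can even out the remainder.
    order = sorted(range(len(block_sizes)), key=lambda i: block_sizes[i], reverse=True)
    for i in order:
        # Pick the device that would finish earliest after taking this block.
        best = min(
            range(len(device_speeds)),
            key=lambda d: (loads[d] + block_sizes[i]) / device_speeds[d],
        )
        assignment[best].append(i)
        loads[best] += block_sizes[i]
    return assignment, loads


# Assumed inputs for illustration: 8 grid blocks, one CPU group and three coprocessors.
block_sizes = [900_000, 800_000, 700_000, 600_000, 500_000, 400_000, 300_000, 200_000]
device_speeds = [1.0, 0.8, 0.8, 0.8]  # made-up relative speeds

assignment, loads = partition_blocks(block_sizes, device_speeds)
for device, (blocks, cells) in enumerate(zip(assignment, loads)):
    print(f"device {device}: blocks {blocks}, {cells} cells")

# Speedup from the runtimes reported in the abstract (seconds per time step).
speedup = 64.72 / 21.06
print(f"speedup: {speedup:.2f}")
```

A static partition like this suits a solver whose per-cell cost is roughly uniform. Whether LESAP's partitioning is static or adaptive is not stated in the abstract.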
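Keywords: scramjet combustion numerical simulation; heterogeneous many-core platform; Intel Many Integrated Core (MIC); OpenMP 4.0; performance optimization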
Classification: TP391 [Automation and Computer Technology: Computer Application Technology]