Performance impact analysis of front side bus competition on memory-intensive programs on the Intel Bensley platform  (cited by: 1)


Authors: Mao Xiaowei (毛晓炜)[1], Tao Xianping (陶先平)[1], He Wanqing (何万青)

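Affiliations: [1] State Key Laboratory for Novel Software Technology, Department of Computer Science and Technology, Nanjing University, Nanjing 210093; [2] Intel China Ltd., Beijing 100020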

Source: Journal of Nanjing University (Natural Science), 2010, No. 2, pp. 149-158 (10 pages)

Funding: National "863" Program (2007AA01Z178); Natural Science Foundation of Jiangsu Province (BK2006712)

Abstract: Symmetric multiprocessor (SMP) clusters are the mainstream architecture in high performance computing (HPC) because of their good cost-performance ratio and excellent scalability, and the Intel 2-way Quad-Core platform has gradually become the mainstream platform for a single node. However, on the popular Intel 2-way Quad-Core platform named Bensley, front side bus (FSB) competition heavily affects the performance of memory-intensive applications, because the four cores in each CPU share a single FSB and the two FSBs are not completely independent. Message Passing Interface (MPI) is both a specification and a family of implementations that allow many computers to communicate with one another; it is widely accepted in parallel computing because of its high performance, scalability, and portability. This paper gives a model to predict the performance impact of FSB competition on memory-intensive MPI applications on the Intel Bensley 2-way Quad-Core platform. To discuss the issue, we introduce a new variable called Speeddown to describe the performance decline caused by FSB competition. Generally, a complex HPC MPI application can be divided into a number of basic blocks, within each of which bus utilization is continuous and balanced. By analyzing the address bus utilization and data bus utilization of the system when a single basic block process runs bound to core 0, together with the relationship between bus utilization and the amount of data read from and written back to memory, we deduce equations that predict the Speeddown when 2, 4, or 8 basic block processes run bound to different cores. For complex memory-intensive MPI applications, we focus on computing time to study the performance impact of FSB competition.
Since the computing time can be divided into serial time and parallel time, we analyze the Speeddown of each separately when 4 or 8 processes are created and bound to specific cores. A method is then introduced to merge them into the final performance impact model. A test application is written, and examples are used, to validate the effectiveness of the model.
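
The abstract does not reproduce the paper's equations, so the following is only a minimal sketch of how a slowdown factor for serial and parallel time might be combined. It assumes that Speeddown is the ratio of runtime under FSB contention to the runtime of a single process bound to core 0, and that the merge is a time-weighted average. Both assumptions, and the function names `speeddown` and `merged_speeddown`, are illustrative and are not taken from the paper.

```python
def speeddown(contended_time: float, baseline_time: float) -> float:
    """Assumed definition: runtime under FSB contention divided by the
    runtime of a single process running alone (baseline)."""
    if baseline_time <= 0:
        raise ValueError("baseline_time must be positive")
    return contended_time / baseline_time


def merged_speeddown(serial_time: float, parallel_time: float,
                     serial_sd: float, parallel_sd: float) -> float:
    """Assumed merge rule: weight each phase's Speeddown by the share of
    uncontended computing time spent in that phase."""
    total = serial_time + parallel_time
    if total <= 0:
        raise ValueError("total computing time must be positive")
    return (serial_time * serial_sd + parallel_time * parallel_sd) / total


if __name__ == "__main__":
    # Hypothetical timings in seconds, not measurements from the paper.
    sd_parallel = speeddown(contended_time=15.0, baseline_time=10.0)
    overall = merged_speeddown(serial_time=2.0, parallel_time=8.0,
                               serial_sd=1.0, parallel_sd=sd_parallel)
    print(f"parallel Speeddown = {sd_parallel:.2f}, overall = {overall:.2f}")
```

Under these assumptions, a program whose parallel phase dominates its computing time would see an overall Speeddown close to that of the parallel phase alone.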

Keywords: memory-intensive application; Bensley; front side bus; address bus utilization; data bus utilization

Classification: TP302.1 [Automation and Computer Technology - Computer Architecture]

 
