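Affiliations: [1] Jiangnan Institute of Computing Technology, Wuxi, Jiangsu 214083 [2] National Research Center of Parallel Computer Engineering and Technology, Beijing 100080
Source: Chinese Journal of Computers (《计算机学报》), 2015, No. 5, pp. 1044-1055 (12 pages)
Funding: Supported by the National High Technology Research and Development Program of China ("863" Program), grant 2012AA010903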
Abstract: As high-performance computer systems continue to grow in scale, the communication wall problem is becoming increasingly severe. Optimizing the mapping of logical processes onto the physical topology can improve applications' communication efficiency and has become an active research topic in high-performance computing. Because its mapping granularity is too fine, the traditional process mapping optimization model yields low mapping efficiency and tends to split process clusters within which communication is dense. To address this, we propose an aggregated quadratic assignment problem (AQAP) model and, guided by it, a novel process mapping optimization method based on clustering analysis. The method first applies a spectral clustering algorithm to the process communication pattern, then maps the resulting process clusters onto the physical topology with a self-adaptive aggregated process mapping strategy, and finally refines the cluster mapping with an aggregated Pair-Exchange algorithm. To our knowledge, this is the first application of spectral clustering to the process mapping problem. The method effectively reduces long-distance communication and enhances communication locality. Experiments with the NPB benchmarks and two practical applications show that the method yields significant performance improvements and outperforms existing Pair-Exchange-based and graph-partitioning-based process mapping methods.
Classification number: TP319 [Automation and Computer Technology: Computer Software and Theory]
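
The abstract builds on the quadratic assignment problem (QAP) view of process mapping and on Pair-Exchange refinement. The following is a minimal sketch of the classic, non-aggregated form of both, not the paper's AQAP model or its aggregated Pair-Exchange algorithm. It assumes a communication matrix `comm` (traffic between processes) and a distance matrix `dist` (cost between cores); the names `qap_cost` and `pair_exchange` and the sample data are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def qap_cost(comm, dist, mapping):
    """QAP objective: traffic between each process pair times the
    distance between the cores those processes are mapped to."""
    n = len(mapping)
    return sum(
        comm[i][j] * dist[mapping[i]][mapping[j]]
        for i in range(n)
        for j in range(n)
    )


def pair_exchange(comm, dist, mapping):
    """First-improvement local search: swap the cores of two processes
    and keep the swap only if the QAP cost strictly decreases."""
    mapping = list(mapping)  # work on a copy
    best = qap_cost(comm, dist, mapping)
    improved = True
    while improved:
        improved = False
        for i, j in combinations(range(len(mapping)), 2):
            mapping[i], mapping[j] = mapping[j], mapping[i]
            cost = qap_cost(comm, dist, mapping)
            if cost < best:
                best = cost
                improved = True
            else:
                # Undo a swap that did not help.
                mapping[i], mapping[j] = mapping[j], mapping[i]
    return mapping, best


# Hypothetical data: processes 0-1 and 2-3 communicate heavily;
# cores 0-1 and 2-3 are close to each other, the two groups are far apart.
comm = [[0, 10, 1, 1], [10, 0, 1, 1], [1, 1, 0, 10], [1, 1, 10, 0]]
dist = [[0, 1, 4, 4], [1, 0, 4, 4], [4, 4, 0, 1], [4, 4, 1, 0]]
start = [0, 2, 1, 3]  # splits both heavily communicating pairs

final, final_cost = pair_exchange(comm, dist, start)
print(qap_cost(comm, dist, start), final_cost)  # prints "180 72"
```

On this toy input a single accepted swap brings each heavily communicating pair onto neighbouring cores. Exchanging individual processes in this way is the fine-grained mapping that the abstract describes as inefficient at scale, which is the motivation it gives for clustering processes first and mapping whole clusters.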