面向通讯同步的多处理器阵列重构  

Reconfiguring Multiprocessor Arrays for Synchronous Communication

在线阅读下载全文

作  者:吴亚兰[1] 武继刚[1] 姜文超[1] 刘竹松[1] 

机构地区:[1]广东工业大学计算机学院,广州510006

出  处:《计算机科学》2017年第7期47-56,共10页Computer Science

基  金:国家自然科学基金项目(61572144);广东省科技计划应用专项基金(2015B010129014);广东省自然科学基金项目(2016A030313703)资助

摘  要:从多处理器阵列中获取所需大小并且同步通讯性能优良的子阵列,是高性能拓扑重构的核心问题之一。基于不同的逻辑列剔除策略提出了3种面向通讯同步的拓扑重构算法:基于分治思想剔除逻辑列的重构算法(SCA_01),该算法能够使被优化的逻辑列相对均匀地分布在物理阵列中;优先剔除长逻辑列的贪心重构算法(SCA_02),该算法能够使被优化的逻辑列的长链接总数最少;基于分治与长链接数的混成重构算法(SCA_03),该算法将某一区域内的最长逻辑列剔除,且尽可能将剩余逻辑列均匀分布在物理阵列中。同时,对逻辑阵列的最大通讯延时给出了下界的求解算法。实验结果表明,3种算法在故障率小于1%、逻辑列的剔除率超过20%时,算法重构出的逻辑阵列的通讯延时特别接近计算出的性能下界。在多数情况下SCA_01优于SCA_02和SCA_03,而后两者的性能相近。在小阵列上且故障率与剔除率较小时,SCA_02具有性能优势,但在大阵列上SCA_03具有优势。在32×32的阵列上,SCA_01构造的阵列产生的通讯延时较SCA_02和SCA_03产生的延时平均减少25%,并且运行速度也提升了19.4%。Reconfiguring VLSI arrays to get a logical array with given size and synchronous communication is one of the key problems in reconfigurable topology for high performance computing.This paper presented three algorithms based on three different strategies of excluding logical co-lumns.The first algorithm,named SCA01,can make the logical columns in the uniform distribution in the host array,based on the divide-and-conquer for excluding logical columns.The second algorithm,named SCA02,can minimize the number of the long interconnects of the logical array,based on the excluding the long logical column in priority.The third algorithm,named SCA03,keeps the logical columns distributed in uniform way based on the hybrid strategy from excluding the long logical column and divide-and-conquer.In addition,this paper contributed an algorithm to calculate the lower bound of the communication delay for the given logical array.Simulation results show that,the communication delay of the logical array reconstructed by three algorithms is close to the lower bound proposed in this paper,when fault rate is less than 1%and the exclusion rate of logical columns is over20%.Algorithm SCA01is superior to SCA02and SCA03in most cases,while SCA02and SCA03have nearly the same performance.But SCA02is better on smaller arrays and SCA03is better on large arrays,when the fault rate and exclusion rate are relatively small.The communication delay generated by SCA01is less than that of SCA02and SCA03by 25% on 32×32host arrays.Moreover,SCA01is faster than the other two algorithms,and the running time is saved by 19.4%.It is concluded that SCA01is one of the relatively desirable algorithms to generate the logical arrays with minimum communication delay for high performance computing.

关 键 词:VLSI阵列 拓扑重构 容错 分治 算法 

分 类 号:TP303[自动化与计算机技术—计算机系统结构]

 

参考文献:

正在载入数据...

 

二级参考文献:

正在载入数据...

 

耦合文献:

正在载入数据...

 

引证文献:

正在载入数据...

 

二级引证文献:

正在载入数据...

 

同被引文献:

正在载入数据...

 

相关期刊文献:

正在载入数据...

相关的主题
相关的作者对象
相关的机构对象