SLP Optimization Algorithm Using Across Basic Block Transformation and Loop Distribution

Original title: 基于跨基本块变换和循环分布的SLP优化技术


Authors: Suo Weiyi (索维毅)[1], Zhao Rongcai (赵荣彩)[1], Yao Yuan (姚远)[1], Zhang Xiaomei (张小妹)

Affiliations: [1] PLA Information Engineering University; [2] Automation Station, PLA Unit 73311

Source: Computer Science (《计算机科学》), 2013, No. 10, pp. 24-28, 60 (6 pages)

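Funding: Supported by the National Science and Technology Major Project for Core Electronic Devices, High-end General Chips and Basic Software (核高基), grant 2009ZX01036-001-001-2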

Abstract: Existing SLP optimization algorithms cannot handle dependence cycles and reductions in inner loops, and they generate many redundant unpacking and assignment statements at basic block boundaries, which lowers vectorization efficiency. To address this problem, this paper proposes an SLP optimization algorithm based on cross-basic-block transformation and loop distribution. Working from the control flow graph, the algorithm performs vectorization transformations across basic blocks according to the Define-Use relations of array variables between basic blocks and the data dependences that span basic blocks. It applies cross-basic-block transformation and loop distribution in order, exposing as much statement-level parallelism as possible within the basic blocks of the innermost loop, so that an SLP auto-vectorizing compiler generates vectorized code containing more SIMD instructions. Experimental results show that the algorithm hides more of the overhead of redundant cross-basic-block operations and uses cross-basic-block data dependences to generate better SIMD instructions, effectively improving the speedup of vectorized programs.
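The abstract names loop distribution as the way to deal with dependence cycles in an inner loop. The following is a minimal sketch of that general idea, not the paper's algorithm: the function names `fused` and `distributed`, the array names, and the group width of 4 are all assumptions made for illustration, and the arrays are assumed not to alias. Statement S1 is a recurrence that forms a dependence cycle, while S2 only reads the value S1 produced in the same iteration. Distributing the loop isolates the cycle in a scalar loop and leaves S2 in a loop whose unrolled body consists of isomorphic, independent statements, the kind of group an SLP pass can typically pack into one SIMD operation.

```c
#include <stddef.h>

/* Original inner loop (illustrative). S1 carries a dependence cycle
 * (a[i] needs a[i-1]), so the loop as a whole is hard to vectorize. */
void fused(float *a, const float *b, float *c, const float *d, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        a[i] = a[i - 1] + b[i];   /* S1: loop-carried dependence cycle */
        c[i] = a[i] + d[i];       /* S2: depends on S1 only within iteration i */
    }
}

/* After loop distribution (assumes a, b, c, d do not alias).
 * The S1 loop must run first because S2 reads a[i]. */
void distributed(float *a, const float *b, float *c, const float *d, size_t n)
{
    /* Loop 1: the dependence cycle stays scalar. */
    for (size_t i = 1; i < n; i++)
        a[i] = a[i - 1] + b[i];

    /* Loop 2: unrolled by an assumed width of 4. The four statements are
     * isomorphic and independent, so they are candidates for packing
     * into a single SIMD add. */
    size_t i = 1;
    for (; i + 4 <= n; i += 4) {
        c[i]     = a[i]     + d[i];
        c[i + 1] = a[i + 1] + d[i + 1];
        c[i + 2] = a[i + 2] + d[i + 2];
        c[i + 3] = a[i + 3] + d[i + 3];
    }

    /* Remainder iterations that do not fill a full group. */
    for (; i < n; i++)
        c[i] = a[i] + d[i];
}
```

Both functions compute the same values; the sketch only shows why the distributed form gives a vectorizer more to work with. Whether a real compiler packs the second loop depends on its cost model, alignment, and aliasing information.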

Keywords: SLP; cross-basic-block transformation; loop distribution; data dependence; control flow graph; Define-Use relation

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
