Revisiting the Parallel Strategy for DOACROSS Loops  Cited by: 1


Authors: Song Liu, Yuan-Zhen Cui, Nian-Jun Zou, Wen-Hao Zhu, Dong Zhang, Wei-Guo Wu

Affiliations: [1] School of Electronic Information and Engineering, Xi’an Jiaotong University, Xi’an 710049, China; [2] School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China; [3] Xi’an Research Institute of Surveying and Mapping, Xi’an 710054, China; [4] State Key Laboratory of Geo-Information Engineering, Xi’an 710054, China

Source: Journal of Computer Science & Technology (计算机科学技术学报, English edition), 2019, Issue 2, pp. 456-475 (20 pages)

Funding: The National Key Research and Development Program of China under Grant No. 2016YFB0201800; the National Natural Science Foundation of China under Grant Nos. 91630206 and 91330117.

Abstract: DOACROSS loops are significant parts of many important scientific and engineering applications, and their pipeline/wave-front parallelism is generally exploited through loop transformations. However, most previous work assigns iterations to parallel threads statically, which wastes computing resources on thread synchronization. This paper proposes a new parallel strategy for DOACROSS loops that provides dynamic task assignment with reduced dependences to achieve wave-front parallelism through loop tiling. The proposed strategy uses a master-slave parallel mode and some customized structures to realize dynamic and flexible parallelization, which effectively keeps threads from waiting on communication. An efficient tile size selection (TSS) approach is also proposed to preserve data reuse in cache for tiled codes. The experimental results show that the proposed parallel strategy obtains good and stable speedups over six typical benchmarks with different problem sizes and different numbers of threads on an Intel® Xeon® 32-core server. It outperforms two static strategies, a barrier-based strategy and a post/wait-based strategy, by 32% and 20% in average performance, respectively. The strategy also yields better performance than a mutex-based dynamic strategy. In addition, the proposed TSS approach is shown to achieve near-optimal performance and to be comparable with a state-of-the-art TSS approach.

Keywords: DOACROSS loop; wave-front parallelism; tile size selection; dynamic task assignment; synchronization optimization

Classification: TP [Automation and Computer Technology]
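
The dynamic, master-slave style of wave-front scheduling mentioned in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the names `run_wavefront`, `demo`, `pending`, `ready`, and `done` are hypothetical, the dependence pattern (each tile waits on its left and upper neighbours) is an assumed example, and Python threads are used only to show the control flow, not to obtain speedup. A master thread alone updates the dependence counters and hands ready tiles to workers through a queue, so workers never block on each other directly.

```python
import queue
import threading


def run_wavefront(n_rows, n_cols, n_workers, work):
    """Run work(i, j) once per tile; a tile starts only after its
    left and upper neighbours have finished (assumed dependence pattern)."""
    if n_rows == 0 or n_cols == 0:
        return
    # Unfinished predecessors per tile: left neighbour plus upper neighbour.
    pending = [[(i > 0) + (j > 0) for j in range(n_cols)] for i in range(n_rows)]
    ready = queue.Queue()  # tiles whose predecessors are all finished
    done = queue.Queue()   # tiles reported finished by workers

    def worker():
        while True:
            tile = ready.get()
            if tile is None:  # sentinel: no more work
                return
            work(*tile)
            done.put(tile)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()

    # Master loop: only this thread touches `pending`, so no lock is needed.
    # Simplification: an exception inside work() would leave the master waiting.
    ready.put((0, 0))
    for _ in range(n_rows * n_cols):
        i, j = done.get()
        for si, sj in ((i + 1, j), (i, j + 1)):
            if si < n_rows and sj < n_cols:
                pending[si][sj] -= 1
                if pending[si][sj] == 0:
                    ready.put((si, sj))

    for _ in threads:
        ready.put(None)
    for t in threads:
        t.join()


def demo(n=8, tile=4, n_workers=3):
    """Tiled DOACROSS-style loop: a[i][j] = a[i-1][j] + a[i][j-1] + 1."""
    a = [[0] * n for _ in range(n)]

    def work(ti, tj):
        for i in range(ti * tile, min((ti + 1) * tile, n)):
            for j in range(tj * tile, min((tj + 1) * tile, n)):
                up = a[i - 1][j] if i > 0 else 0
                left = a[i][j - 1] if j > 0 else 0
                a[i][j] = up + left + 1

    n_tiles = (n + tile - 1) // tile
    run_wavefront(n_tiles, n_tiles, n_workers, work)
    return a
```

Calling `demo()` fills an 8x8 grid using 4x4 tiles and should match a plain sequential double loop. The tile size here is arbitrary; the abstract's TSS approach for choosing it is not modelled.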

 
