Towards Efficient Short-Range Pair Interaction on Sunway Many-Core Architecture  (Cited by: 1)

Authors: Jun-Shi Chen, Hong An, Wen-Ting Han, Zeng Lin, Xin Liu

Affiliations: [1] School of Computer Science and Technology, University of Science and Technology of China, Hefei 230026, China; [2] National Research Center of Parallel Computer Engineering and Technology, Beijing 100080, China

Source: Journal of Computer Science & Technology (计算机科学技术学报, English edition), 2021, No. 1, pp. 123-139 (17 pages)

Funding: The work was supported by the National Key Research and Development Program of China under Grant No. 2018YFB0204102.

Abstract: The short-range pair interaction consumes most of the CPU time in molecular dynamics (MD) simulations. The inherent computation sparsity makes it challenging to achieve a high-performance kernel on emerging many-core architectures. In this paper, we present a highly efficient short-range force kernel on the Sunway, a novel many-core architecture with many unique features. The parallel efficiency of this algorithm on the Sunway many-core processor is strongly limited by poor data locality and write conflicts. To enhance data locality, we adopt a super-cluster-based neighbor list with an appropriate granularity that fits in the local memory of the computing cores. In the absence of a low-overhead locking mechanism, using a data-privatization force array is a more feasible method to avoid write conflicts, but it results in a large data reduction overhead. We adopt a dual-slice partitioning scheme for both hardware resources and computing tasks, which utilizes on-chip data communication to reduce the data reduction overhead and provide load balancing. Moreover, we exploit single instruction multiple data (SIMD) parallelism and perform instruction reordering of the force kernel on this many-core processor. The experimental results show that the optimized force kernel obtains a performance speedup of 226x compared with the reference implementation and achieves 20% of the peak flop rate on the Sunway many-core processor.
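Illustration (not from the paper): the abstract's data-privatization idea can be sketched in a few lines. In the sketch below, each worker accumulates pair forces into its own private force array, so no two workers write to the same memory, and the private arrays are summed in a final reduction step. Everything here is an assumption made for illustration: the Lennard-Jones potential, the names `lj_pair_force` and `forces_privatized`, the round-robin pair assignment, and the sequential loop that stands in for parallel computing cores. It does not reproduce the paper's super-cluster neighbor list, dual-slice partitioning, or SIMD kernel.

```python
def lj_pair_force(ri, rj, cutoff, epsilon=1.0, sigma=1.0):
    """Lennard-Jones force on particle i due to particle j (zero beyond cutoff)."""
    d = [a - b for a, b in zip(ri, rj)]
    r2 = sum(c * c for c in d)
    if r2 == 0.0 or r2 >= cutoff * cutoff:
        return [0.0, 0.0, 0.0]
    s6 = (sigma * sigma / r2) ** 3
    # F_i = 24*eps*(2*(sigma/r)^12 - (sigma/r)^6) / r^2 * (r_i - r_j)
    scale = 24.0 * epsilon * (2.0 * s6 * s6 - s6) / r2
    return [scale * c for c in d]


def forces_privatized(pos, cutoff, n_workers=4):
    """Accumulate pair forces into per-worker private arrays, then reduce."""
    n = len(pos)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]

    # One private force array per (hypothetical) worker: no shared writes.
    private = [[[0.0] * 3 for _ in range(n)] for _ in range(n_workers)]

    # Workers are simulated one after another here; on real hardware each
    # slice of the pair list would run on its own core.
    for w in range(n_workers):
        for i, j in pairs[w::n_workers]:
            f = lj_pair_force(pos[i], pos[j], cutoff)
            for k in range(3):
                private[w][i][k] += f[k]  # force on i
                private[w][j][k] -= f[k]  # Newton's third law: reaction on j

    # Reduction: sum the private arrays into the final force array.
    # Its cost grows with n_workers * n, which is the overhead the abstract
    # refers to.
    total = [[0.0] * 3 for _ in range(n)]
    for w in range(n_workers):
        for i in range(n):
            for k in range(3):
                total[i][k] += private[w][i][k]
    return total
```

Calling `forces_privatized(positions, cutoff=2.5, n_workers=4)` on a list of 3D coordinates returns one force vector per particle. The result should match the single-worker case up to floating-point rounding, since only the order of summation changes.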

Keywords: molecular dynamics; Sunway; many-core; pair interaction; parallel algorithm

Classification: TP31 [Automation and Computer Technology: Computer Software and Theory]

 
