A parallel algorithm for Kirchhoff prestack time migration on large-scale heterogeneous clusters  Cited by: 5

Kirchhoff prestack time migration on large heterogeneous computing systems


Authors: 赵长海, 罗国安, 张旭东, 王狮虎, 张建磊, 王成祥

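Affiliations: [1] Beijing Branch, Geophysical Technology Research Center, BGP Inc., China National Petroleum Corporation, Beijing 100088; [2] Geophysical Technology Research Center, BGP Inc., China National Petroleum Corporation, Zhuozhou, Hebei 072751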

Source: Oil Geophysical Prospecting (《石油地球物理勘探》), 2016, No. 5, pp. 1040-1048, 840 (9 pages in total)

Funding: Supported by the National Science and Technology Major Project (2011ZX05019)

Abstract: The data volume of a single exploration project already exceeds 100 TB, and PB-scale projects are foreseeable. To keep pace with the rapid growth of seismic data and to match the architecture of very large heterogeneous clusters, a multi-dimensional imaging-space decomposition algorithm is proposed. Because a large cluster system has several levels of parallelism, the imaging space is first decomposed along the offset direction, then split further along the inline direction until it is smaller than the physical memory of a compute node, and finally decomposed bin by bin on the two-dimensional surface. The parallel algorithm reduces the coupling between tasks, maps readily onto the multiple parallel levels of a heterogeneous cluster, and facilitates asynchronous execution across heterogeneous processors. Relative to a high-performance CPU of the same period, the GPU version achieves a 4.8x speedup and the MIC version a 2x speedup; a comparative analysis of the two types of coprocessor in terms of performance, energy consumption, and programmability is given. Processing real seismic data on 1024 nodes of Tianhe-1 yields a nearly linear speedup curve.

Abstract (published English version): The size of seismic data from a single survey has now reached 100 TB and may exceed 1 PB in the near future. To support increasingly large survey data sizes and processing complexity, we propose a practical approach to large-scale parallel processing of 3D prestack Kirchhoff time migration (PSTM) with multi-dimension imaging space decomposition on heterogeneous computing systems. The parallel algorithm is based on a three-level decomposition of the imaging space. First, the imaging space is partitioned by offset. Each node runs just one process, and all processes are divided into several distinct groups. The imaging work of a common-offset space is assigned to a group, and the common-offset input traces are dynamically distributed to the processes of the group. Once all input traces are migrated, the local imaging sections of all the processes in a group are summed to form the final common-offset image. Within a node, the common-offset imaging section is further partitioned equally by CMP. If the size of a common-offset imaging section exceeds the total physical memory of the compute node, the whole imaging space is first partitioned along the in-line direction so that each common-offset imaging space fits in memory. The algorithm greatly reduces the dependencies among tasks. The task partitions can be easily mapped to multiple heterogeneous processors and executed asynchronously. Compared with the production CPU version of PSTM, the GPU version achieves a speedup of up to 4.8 times and the MIC version a speedup of up to 2 times. A comparative analysis of GPU and MIC in terms of power consumption, performance, and programmability is also given. The PSTM implementation obtains close to linear speedup when processing real data on the Tianhe-1 supercomputer.
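The three-level decomposition described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Task`, `split_evenly`, and `decompose`, the 4-byte sample size, and the rule of adding one inline slab at a time until a slab fits in memory are all assumptions made for illustration. The sketch only plans the partition (offset, then inline slab, then an equal split of bins among the workers of a node); it does not perform migration, dynamic trace distribution, or the final summation within a group.

```python
from dataclasses import dataclass
from math import ceil


@dataclass(frozen=True)
class Task:
    """One unit of imaging work (hypothetical structure, for illustration)."""
    offset: int          # index of the common-offset class
    inline_range: range  # inline slab that must fit in one node's memory
    bin_ranges: list     # one contiguous range of bins per worker in the node


def split_evenly(n, parts):
    """Split range(n) into `parts` contiguous ranges of near-equal length."""
    base, extra = divmod(n, parts)
    ranges, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


def decompose(n_offsets, n_inlines, n_xlines, n_samples,
              node_mem_bytes, workers_per_node, bytes_per_sample=4):
    """Plan an offset -> inline slab -> bin decomposition of the imaging space."""
    # Memory needed to hold the image of a single inline for one offset.
    line_bytes = n_xlines * n_samples * bytes_per_sample

    # Level 2: increase the number of inline slabs until the largest slab fits
    # in node memory (assumed policy; the paper may split differently).
    slabs = 1
    while ceil(n_inlines / slabs) * line_bytes > node_mem_bytes:
        if slabs == n_inlines:
            raise ValueError("a single inline does not fit in node memory")
        slabs += 1

    tasks = []
    for offset in range(n_offsets):                      # level 1: by offset
        for slab in split_evenly(n_inlines, slabs):      # level 2: by inline
            n_bins = len(slab) * n_xlines                # bins on the 2D surface
            bin_ranges = split_evenly(n_bins, workers_per_node)  # level 3
            tasks.append(Task(offset, slab, bin_ranges))
    return tasks


if __name__ == "__main__":
    plan = decompose(n_offsets=2, n_inlines=10, n_xlines=4, n_samples=100,
                     node_mem_bytes=9000, workers_per_node=3)
    for task in plan:
        print(task.offset, task.inline_range, [len(r) for r in task.bin_ranges])
```

Because each task owns a disjoint part of the image, tasks in this plan share no output memory, which is the property the abstract credits for the low coupling and the asynchronous execution across heterogeneous processors.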

Keywords: integral method; prestack Kirchhoff time migration; parallel algorithm; offset; heterogeneous; GPU; MIC

Classification code: TP631 [Automation and Computer Technology: Control Theory and Control Engineering]

 
