Authors: JING Hui (景辉); QIN Bo (秦勃) [1]; JIANG Xiao-Yi (姜晓轶) [2]; XIA Hai-Tao (夏海涛) (Computer Science and Technology Department, Ocean University of China, Qingdao 266100, China; The Key Laboratory of Digital Oceanic Science and Technology, National Marine Data and Information Service, Tianjin 300171, China)
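Affiliations: [1] College of Information Science and Engineering, Ocean University of China, Qingdao, Shandong 266100 [2] Key Laboratory of Digital Ocean Science and Technology, State Oceanic Administration, Tianjin 300171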
Source: Periodical of Ocean University of China (《中国海洋大学学报(自然科学版)》), 2018, Issue A02, pp. 180-186 (7 pages)
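Funding: Supported by the project "Application Research on a Cloud Computing and Cloud Service Architecture Framework for Marine Environmental Information" (931146140)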
Abstract: Processing large-scale, long-time-series Oceanographic Geospatial Data (OGD) is computation-intensive. This paper focuses on task scheduling for distributed parallel OGD processing on Spark with GPUs, with the aim of improving processing efficiency for large-scale, long-time-series OGD and meeting real-time interaction requirements. Spark's original scheduling algorithms (FIFO, FAIR) show low efficiency and long execution times when running computation-intensive tasks. To address this, the paper presents a Spark-GPU Framework (SGF), which comprises a Spark-GPU Scheduler (SGS) and a Spark-GPU Runtime (SGR). SGS uses the computation amount of each GPU task and the computing capacity of each GPU device as scheduling factors. The resulting problem is the well-known task scheduling problem on unrelated parallel machines, which is solved with a polynomial-time 2-approximation algorithm. SGR uses JNI+CUDA as the GPU task runtime; this approach needs only one JNI call, which keeps it efficient, and is easy to program and debug. The main contributions are: (1) an improved Spark-GPU framework that supports more balanced scheduling of large-scale computation tasks; and (2) a scheduling algorithm for large-scale computation tasks on heterogeneous GPU devices that accounts for GPU tasks with different computation amounts and GPU devices with different computing capacities. Flow field visualization with the line integral convolution algorithm serves as the test application. On a cluster with 10 GPU nodes, scheduling tests with 1,000 to 2,000 fields show that, compared with native Spark scheduling, SGF reduces execution time by 14% to 18% and raises the GPU time occupancy ratio by 10% to 20%.
Keywords: Spark; cloud computing; distributed parallel processing; GPU; task scheduling; unrelated parallel machine scheduling
Classification: TP338.8 [Automation and Computer Technology: Computer System Architecture]
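Illustrative sketch: the abstract frames GPU task placement as scheduling on unrelated parallel machines, where each task can take a different time on each GPU and the goal is a small overall finish time (makespan). The abstract does not spell out the 2-approximation algorithm itself, so the Python below is not that algorithm. It is a much simpler greedy heuristic, shown only to make the problem setup concrete. The function name `greedy_schedule` and the `times` matrix are hypothetical; in practice the entries might be estimated from each task's computation amount and each device's computing capacity, which is an assumption here.

```python
from typing import List, Tuple


def greedy_schedule(times: List[List[float]]) -> Tuple[List[int], float]:
    """Assign tasks to GPUs with a greedy earliest-completion-time rule.

    times[i][j] is the (assumed, pre-estimated) run time of task i on GPU j.
    Returns (assignment, makespan), where assignment[i] is the GPU index
    chosen for task i. This is a heuristic with no 2-approximation guarantee.
    """
    if not times:
        return [], 0.0

    n_gpus = len(times[0])
    loads = [0.0] * n_gpus           # accumulated busy time per GPU
    assignment = [-1] * len(times)

    # Handle the heaviest tasks first, ranked by their best-case run time.
    order = sorted(range(len(times)), key=lambda i: min(times[i]), reverse=True)

    for i in order:
        # Choose the GPU on which task i would finish earliest.
        best = min(range(n_gpus), key=lambda g: loads[g] + times[i][g])
        loads[best] += times[i][best]
        assignment[i] = best

    return assignment, max(loads)


# Four tasks on two GPUs with different per-device run times.
example_times = [[4, 10], [5, 8], [6, 3], [2, 1]]
assignment, makespan = greedy_schedule(example_times)
print(assignment, makespan)
```

For this input the sketch places the first two tasks on GPU 0 and the last two on GPU 1, for a makespan of 9.0. A heuristic like this can be arbitrarily far from optimal on adversarial inputs, which is the gap a guaranteed 2-approximation method closes.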