Authors: 李安民 (LI An-min); 计卫星 (JI Wei-xing) [1]; 廖心怡 (LIAO Xin-yi); 高建花 (GAO Jian-hua); 谈兆年 (TAN Zhao-nian); 王一拙 (WANG Yi-zhuo) [1]; 石峰 (SHI Feng) [1] (School of Computer Science & Technology, Beijing Institute of Technology, Beijing 100081, China)
Source: 《计算机工程与科学》 (Computer Engineering & Science), 2019, No. 3, pp. 424-432 (9 pages)
Funding: National Natural Science Foundation of China (61300010)
Abstract: With the advent of the artificial intelligence era, heterogeneous computing plays an increasingly important role in fields such as deep learning and scientific computing. One bottleneck limiting the application of heterogeneous computing systems is the lack of an efficient software development framework. Existing programming frameworks that support GPUs, DSPs, and FPGAs, such as OpenCL and CUDA, are based on C/C++ and traditional parallel programming methods. As a result, software development efficiency is low, programs are difficult to reason about and debug, and cooperation and scheduling among computing devices are hard to handle flexibly. This paper proposes a script-language-based structured parallel programming framework for heterogeneous computing platforms. The framework provides a structured parallel programming interface, supports mapping computing tasks to heterogeneous computing devices, and facilitates reasoning about and verification of parallel programs. A structured scheduling algorithm based on a genetic algorithm is designed and implemented to make full use of the computing capability of heterogeneous systems, improving software development efficiency for such systems. Experimental results show that the proposed framework achieves a speedup of 1.5× to 2.5× over a single processor on a CPU+GPU platform.
Classification: TP303 [Automation and Computer Technology: Computer System Architecture]