MemLoop: A Programming Framework Using In-Memory Cache for Iterative Applications

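Affiliations: [1] National Engineering Research Center of Fundamental Software, Institute of Software, Chinese Academy of Sciences, Beijing 100190; [2] University of Chinese Academy of Sciences, Beijing 100190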

Source: Computer Systems & Applications, 2015, No. 3, pp. 44-49 (6 pages)

Funding: National Natural Science Foundation of China (61100067)

Abstract: Iterative computation is an important class of big data analysis applications. When an iterative computation is implemented on the distributed computing framework MapReduce, it is decomposed into multiple jobs that run in the order defined by their dependencies. As a result, the program interacts with the distributed file system (DFS) many times, which lengthens its execution time. Caching the data involved in these interactions reduces the time spent interacting with the DFS and thus improves the overall performance of the program. Considering that a large amount of memory in a cluster is idle most of the time, this paper proposes MemLoop, a programming framework for iterative applications that uses an in-memory cache. The system implements cache management in three modules (the job submission API, the task scheduling algorithm, and the cache management module) so that free memory can be used to cache two categories of data: inter-iteration resident data and intra-iteration dependent data. We compare the framework with existing related frameworks; the experimental results show that MemLoop can improve the performance of iterative programs.
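To illustrate the idea in the abstract, the following is a minimal, single-process sketch of why an in-memory cache reduces DFS interactions in an iterative program. It is not MemLoop's API or implementation: `SlowStore`, `MemoryCache`, `run_iterations`, and the path names are all hypothetical, and the "DFS" is just a dictionary that counts reads. Each iteration consists of two dependent jobs, so the loop-invariant input plays the role of inter-iteration resident data and the intermediate file plays the role of intra-iteration dependent data.

```python
class SlowStore:
    """Hypothetical stand-in for a DFS; counts every read it serves."""

    def __init__(self, files):
        self._files = dict(files)
        self.reads = 0

    def read(self, path):
        self.reads += 1
        return self._files[path]

    def write(self, path, value):
        self._files[path] = value


class MemoryCache:
    """Read-through, write-through in-memory cache in front of a SlowStore."""

    def __init__(self, store):
        self._store = store
        self._data = {}

    def read(self, path):
        # Only a cache miss reaches the backing store.
        if path not in self._data:
            self._data[path] = self._store.read(path)
        return self._data[path]

    def write(self, path, value):
        # Keep a copy in memory and persist it to the backing store.
        self._data[path] = value
        self._store.write(path, value)


def run_iterations(io, n_iters):
    """Run n_iters iterations, each made of two dependent jobs."""
    io.write("state/0", 0)
    for i in range(n_iters):
        # Job A: reads the loop-invariant input (inter-iteration resident data).
        points = io.read("input/points")
        io.write(f"tmp/{i}", sum(points))
        # Job B: reads job A's output (intra-iteration dependent data)
        # and the result of the previous iteration.
        partial = io.read(f"tmp/{i}")
        prev = io.read(f"state/{i}")
        io.write(f"state/{i + 1}", prev + partial)
    return io.read(f"state/{n_iters}")


if __name__ == "__main__":
    plain = SlowStore({"input/points": [1, 2, 3]})
    run_iterations(plain, 3)
    cached_backing = SlowStore({"input/points": [1, 2, 3]})
    run_iterations(MemoryCache(cached_backing), 3)
    print("store reads without cache:", plain.reads)
    print("store reads with cache:", cached_backing.reads)
```

In this toy setting the cached run touches the backing store only for the first read of the input, while the uncached run goes back to it for every job. A real framework would additionally have to decide where cached data lives in the cluster and schedule tasks near it, which this sketch does not attempt.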

Keywords: job dependency; in-memory cache; iterative program; inter-iteration resident data; intra-iteration dependent data

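Classification: TP333 [Automation and Computer Technology: Computer Systems Architecture]; TP311.1 [Automation and Computer Technology: Computer Science and Technology]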
