A New Computational Model of Optimized Checkpoint Interval (一种新的优化的检查点间隔的求解模型)

Cited by: 1

Authors: Jiang Tingyao (蒋廷耀) [1], Li Qinghua (李庆华) [1]

Affiliation: [1] School of Computer Science, Huazhong University of Science and Technology, Wuhan, Hubei 430074

Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2003, No. 3, pp. 448-451 (4 pages)

Funding: Supported by the National High Performance Computing Fund (993 13)

Abstract: Many applications, sequential or parallel, take a long time to complete, and a failure during execution can destroy a significant amount of computation. Checkpointing with rollback recovery is a technique for limiting that loss in a failure-prone environment. The checkpoint scheme itself, however, adds overhead to the system: an interval that is too long or too short degrades performance, whereas a properly chosen interval optimizes it. The difficulty lies in determining the interval at which the checkpoint scheme performs best. Vaidya's contribution was to show that, in his model, the equation for the optimized checkpoint interval is independent of the checkpoint latency (L) and of the recovery time (R) that the application spends rolling back after an error. This paper introduces a new segment-based model, NSBM, which replaces the mean overhead ratio of Vaidya's model with the mean system availability (utilization), a concept that is easier to understand in the fault-tolerance field, and derives a new equation that is likewise independent of L and R. Finally, we report a set of computational results from experiments and analyze the relation between the two models. We conclude that NSBM is more effective than Vaidya's model for computing the checkpoint interval.
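The trade-off described in the abstract (an interval that is too short or too long both hurt performance) can be illustrated with a small sketch. This is not the NSBM equation and not Vaidya's equation, neither of which is given in this record. It uses Young's classical first-order approximation, in which the optimal interval is roughly sqrt(2 × checkpoint cost × mean time between failures). The function names and the sample values (a checkpoint cost of 2 and a mean time between failures of 100, in arbitrary time units) are assumptions made only for illustration.

```python
import math


def young_interval(checkpoint_cost: float, mtbf: float) -> float:
    """Young's first-order approximation of the optimal checkpoint interval.

    Illustrative stand-in only; this is not the NSBM or Vaidya equation.
    """
    return math.sqrt(2.0 * checkpoint_cost * mtbf)


def first_order_overhead(interval: float, checkpoint_cost: float, mtbf: float) -> float:
    """Approximate fraction of time lost for a given checkpoint interval.

    First term: checkpoint cost paid once per interval.
    Second term: expected rework after a failure, assuming on average
    half an interval of computation is lost per failure.
    """
    return checkpoint_cost / interval + interval / (2.0 * mtbf)


if __name__ == "__main__":
    cost, mtbf = 2.0, 100.0  # hypothetical values, arbitrary time units
    best = young_interval(cost, mtbf)
    # Compare a too-short interval, the approximate optimum, and a too-long one.
    for interval in (5.0, best, 80.0):
        overhead = first_order_overhead(interval, cost, mtbf)
        print(f"interval={interval:6.1f}  overhead={overhead:.3f}")
```

Under this simplified cost model, the overhead is higher on either side of the approximate optimum, which is the qualitative behavior the abstract describes.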

Keywords: optimization; checkpoint interval; computational model; fault tolerance; overhead ratio; utilization; time-segment model; computer

Classification code: TP302.8 [Automation and Computer Technology: Computer System Architecture]

 
