Bayesian serial revision method for RLLC cluster systems failure prediction  

Authors: Qiang Liu, Guang Jin, Jinglun Zhou, Quan Sun, Min Xi

Affiliations: [1] College of Information System and Management, National University of Defense Technology, Changsha 410073, P. R. China; [2] School of Computer Science, McGill University, Montreal H3A2A7, Canada; [3] School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta 303320205, USA; [4] Department of Computer Science, Xi'an Jiaotong University, Xi'an 710049, P. R. China

Source: Journal of Systems Engineering and Electronics (系统工程与电子技术, English edition), 2011, No. 2, pp. 238-246 (9 pages)

Funding: Supported by the National Natural Science Foundation of China (60701006; 60804054; 71071158)

Abstract: Failure prediction plays an important role in many tasks, such as optimal resource management in large-scale systems. However, accurately predicting the number of failures in repairable large-scale long-running computing (RLLC) systems is challenging because of their repairability and scale. To address this challenge, a general Bayesian serial revision prediction method based on the bootstrap approach and the moving average approach is put forward, which can accurately predict the number of failures. To demonstrate the performance gains of the method, extensive experiments are conducted on data from the Los Alamos National Laboratory (LANL) cluster, a typical RLLC system. The experimental results show that the prediction accuracy of the method is 80.2 %, an improvement of 4 % over some typical methods. Finally, the managerial implications of the models are discussed.

Keywords: failure prediction; cluster systems; Bayesian approach; failure rate

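Classification: TP334.7 [Automation and Computer Technology: Computer System Architecture]; TH17 [Automation and Computer Technology: Computer Science and Technology]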

 
