A disk failure prediction model for multiple issues  


Authors: Yunchuan GUAN, Yu LIU, Ke ZHOU, Qiang LI, Tuanjie WANG, Hui LI

Affiliations: [1] Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan 430074, China; [2] School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China; [3] Inspur Electronic Information Industry Co., Ltd., Beijing 250000, China

Source: Frontiers of Information Technology & Electronic Engineering (信息与电子工程前沿(英文版)), 2023, Issue 7, pp. 964-979 (16 pages)

Funding: Project supported by the National Natural Science Foundation of China (No. 61902135) and the Shandong Provincial Natural Science Foundation, China (No. ZR2019LZH003).

Abstract: Disk failure prediction methods have been useful in handling a single issue, e.g., heterogeneous disks, model aging, or minority samples. However, because these issues often exist simultaneously, prediction models that can handle only one of them produce biased predictions in practice. Existing disk failure prediction methods simply fuse various models and lack discussion of training data preparation and learning patterns when facing multiple issues, although the solutions to different issues often conflict with each other. We therefore first explore training data preparation for multiple issues via a data partitioning pattern, i.e., our proposed multi-property data partitioning (MDP). Then, we treat learning with the partitioned data for multiple issues as learning multiple tasks, and introduce the model-agnostic meta-learning (MAML) framework to achieve this learning. Based on these improvements, we propose a novel disk failure prediction model named MDP-MAML. MDP addresses the challenges of uneven partitioning and difficulty in partitioning by time, and MAML addresses the challenge of learning with multiple domains and minor samples for multiple issues. In addition, MDP-MAML can assimilate emerging issues for learning and prediction. On the datasets reported by two real-world data centers, compared to state-of-the-art methods, MDP-MAML can improve the area under the curve (AUC) and false detection rate (FDR) from 0.85 to 0.89 and from 0.85 to 0.91, respectively, while reducing the false alarm rate (FAR) from 4.88% to 2.85%.

Keywords: Storage system reliability; Disk failure prediction; Self-monitoring analysis and reporting technology (SMART); Machine learning

Classification code: TP333 [Automation and Computer Technology: Computer System Architecture]
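
Illustrative sketch: the abstract describes partitioning training data by several properties and then treating each partition as a task for meta-learning. The Python below is only a rough illustration of that general idea, not the authors' MDP or MDP-MAML. It assumes hypothetical record fields (`model`, `quarter`, `x`, `y`), uses a plain group-by in place of MDP (so it does not handle uneven partitions or time-based partitioning), and swaps in a first-order approximation of MAML on a tiny logistic model instead of the full second-order method.

```python
import math
from collections import defaultdict


def partition_by_properties(records, keys):
    """Group records into tasks, one per distinct combination of property values."""
    tasks = defaultdict(list)
    for rec in records:
        tasks[tuple(rec[k] for k in keys)].append(rec)
    return dict(tasks)


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def grad(w, batch):
    """Gradient of the mean logistic loss; the last weight is the bias."""
    g = [0.0] * len(w)
    for rec in batch:
        x = list(rec["x"]) + [1.0]
        err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - rec["y"]
        for i, xi in enumerate(x):
            g[i] += err * xi / len(batch)
    return g


def fomaml_step(w, tasks, inner_lr=0.1, outer_lr=0.1):
    """One first-order MAML meta-update over all tasks with at least two records."""
    usable = [b for b in tasks.values() if len(b) >= 2]
    if not usable:
        return list(w)
    meta_g = [0.0] * len(w)
    for batch in usable:
        half = len(batch) // 2
        support, query = batch[:half], batch[half:]
        # Inner loop: one gradient step on the task's support set.
        adapted = [wi - inner_lr * gi for wi, gi in zip(w, grad(w, support))]
        # Outer loop (first-order): query-set gradient taken at the adapted weights.
        for i, gi in enumerate(grad(adapted, query)):
            meta_g[i] += gi / len(usable)
    return [wi - outer_lr * gi for wi, gi in zip(w, meta_g)]


# Hypothetical toy records: one SMART-like feature, label 1 = failed disk.
records = [
    {"model": "A", "quarter": "Q1", "x": [0.9], "y": 1},
    {"model": "A", "quarter": "Q1", "x": [0.1], "y": 0},
    {"model": "A", "quarter": "Q1", "x": [0.8], "y": 1},
    {"model": "A", "quarter": "Q1", "x": [0.2], "y": 0},
    {"model": "B", "quarter": "Q2", "x": [0.7], "y": 1},
    {"model": "B", "quarter": "Q2", "x": [0.3], "y": 0},
    {"model": "B", "quarter": "Q2", "x": [0.6], "y": 1},
    {"model": "B", "quarter": "Q2", "x": [0.4], "y": 0},
]
tasks = partition_by_properties(records, keys=("model", "quarter"))
w = [0.0, 0.0]  # one feature weight plus a bias
for _ in range(200):
    w = fomaml_step(w, tasks)
```

In this sketch each (disk model, time window) pair becomes its own task, and the meta-update looks for initial weights that adapt well after a single gradient step on any task. A newly observed property combination would simply appear as one more key in `tasks`.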

 
