A critical examination of robustness and generalizability of machine learning prediction of materials properties (Cited by: 2)


Authors: Kangming Li, Brian DeCost, Kamal Choudhary, Michael Greenwood, Jason Hattrick-Simpers

Affiliations: [1] Department of Materials Science and Engineering, University of Toronto, 27 King’s College Cir, Toronto, ON, Canada; [2] Material Measurement Laboratory, National Institute of Standards and Technology, 100 Bureau Dr, Gaithersburg, MD, USA; [3] Theiss Research, La Jolla, CA 92037, USA; [4] CanmetMATERIALS, Natural Resources Canada, 183 Longwood Road South, Hamilton, ON, Canada

Source: npj Computational Materials [Computational Materials (English)], 2023, Issue 1, pp. 1787-1795, 9 pages

Abstract: Recent advances in machine learning (ML) have led to substantial performance improvement in material database benchmarks, but an excellent benchmark score may not imply good generalization performance. Here we show that ML models trained on Materials Project 2018 can have severely degraded performance on new compounds in Materials Project 2021 due to the distribution shift. We discuss how to foresee the issue with a few simple tools. Firstly, the uniform manifold approximation and projection (UMAP) can be used to investigate the relation between the training and test data within the feature space. Secondly, the disagreement between multiple ML models on the test data can illuminate out-of-distribution samples. We demonstrate that the UMAP-guided and query-by-committee acquisition strategies can greatly improve prediction accuracy by adding only 1% of the test data. We believe this work provides valuable insights for building databases and models that enable better robustness and generalizability.
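The committee-disagreement idea in the abstract can be illustrated with a minimal sketch. It assumes bootstrap-resampled linear least-squares models as a stand-in committee and synthetic data; this record does not state which models, features, or selection rule the authors used, and the function names (`fit_committee`, `committee_disagreement`, `select_queries`) are hypothetical.

```python
import numpy as np


def _with_bias(X):
    # Append a constant column so each linear model has an intercept.
    return np.hstack([X, np.ones((len(X), 1))])


def fit_committee(X, y, n_members=10, seed=0):
    """Fit linear least-squares models on bootstrap resamples of (X, y)."""
    rng = np.random.default_rng(seed)
    Xb = _with_bias(X)
    members = []
    for _ in range(n_members):
        idx = rng.integers(0, len(X), size=len(X))  # sample with replacement
        w, *_ = np.linalg.lstsq(Xb[idx], y[idx], rcond=None)
        members.append(w)
    return np.stack(members)  # shape: (n_members, n_features + 1)


def committee_disagreement(members, X):
    """Standard deviation of the committee's predictions for each sample."""
    preds = _with_bias(X) @ members.T  # shape: (n_samples, n_members)
    return preds.std(axis=1)


def select_queries(disagreement, fraction=0.01):
    """Indices of the `fraction` of samples with the largest disagreement."""
    k = max(1, int(round(fraction * len(disagreement))))
    return np.argsort(disagreement)[::-1][:k]


# Synthetic illustration (assumed data, not from the paper):
# training features lie in [0, 1]^3; the last 10 test rows are shifted far away.
rng = np.random.default_rng(42)
X_train = rng.uniform(0.0, 1.0, size=(200, 3))
y_train = (np.sin(3 * X_train[:, 0]) + X_train[:, 1] ** 2
           + 0.1 * rng.normal(size=200))

X_test = np.vstack([
    rng.uniform(0.0, 1.0, size=(190, 3)),  # in-distribution test samples
    rng.uniform(5.0, 6.0, size=(10, 3)),   # shifted (out-of-distribution) samples
])

committee = fit_committee(X_train, y_train)
scores = committee_disagreement(committee, X_test)
queried = select_queries(scores, fraction=0.01)  # 1% of the test pool
```

In this toy setup the samples chosen for labeling are the ones the committee disagrees on most, which are the ones farthest from the training data. A UMAP-guided variant would instead pick test samples from regions of the embedded feature space that the training data does not cover.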

Keywords: PREDICTION, adding, CRITICAL

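Classification codes: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]; TP3 [Automation and Computer Technology: Control Science and Engineering]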

 
