Authors: Kangming Li, Brian DeCost, Kamal Choudhary, Michael Greenwood, Jason Hattrick-Simpers
Affiliations: [1] Department of Materials Science and Engineering, University of Toronto, 27 King's College Cir, Toronto, ON, Canada; [2] Material Measurement Laboratory, National Institute of Standards and Technology, 100 Bureau Dr, Gaithersburg, MD, USA; [3] Theiss Research, La Jolla, CA 92037, USA; [4] CanmetMATERIALS, Natural Resources Canada, 183 Longwood Road South, Hamilton, ON, Canada
Source: npj Computational Materials (Chinese catalog title: Computational Materials (English)), 2023, Issue 1, pp. 1787-1795, 9 pages
Abstract: Recent advances in machine learning (ML) have led to substantial performance improvements on materials database benchmarks, but an excellent benchmark score may not imply good generalization performance. Here we show that ML models trained on Materials Project 2018 can have severely degraded performance on new compounds in Materials Project 2021 due to distribution shift. We discuss how to foresee the issue with a few simple tools. First, uniform manifold approximation and projection (UMAP) can be used to investigate the relation between the training and test data within the feature space. Second, the disagreement between multiple ML models on the test data can illuminate out-of-distribution samples. We demonstrate that the UMAP-guided and query-by-committee acquisition strategies can greatly improve prediction accuracy by adding only 1% of the test data. We believe this work provides valuable insights for building databases and models that enable better robustness and generalizability.
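The query-by-committee idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a toy 1-D regression problem, swaps in bootstrap-resampled least-squares lines for the paper's ML models, and all function names (`fit_line`, `train_committee`, `disagreement`, `select_queries`) are hypothetical. It only shows the general pattern of ranking candidate samples by committee disagreement and acquiring the most contested ones.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a*x + b on 1-D data."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def train_committee(xs, ys, n_models=10, seed=0):
    """Train a committee of linear models on bootstrap resamples."""
    rng = random.Random(seed)
    n = len(xs)
    committee = []
    while len(committee) < n_models:
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # skip degenerate resamples with a single x value
        committee.append(fit_line(bx, [ys[i] for i in idx]))
    return committee


def disagreement(committee, x):
    """Standard deviation of the committee's predictions at x."""
    return statistics.pstdev(a * x + b for a, b in committee)


def select_queries(committee, pool, k):
    """Return the k pool points on which the committee disagrees most."""
    ranked = sorted(pool, key=lambda x: disagreement(committee, x), reverse=True)
    return ranked[:k]


# Toy data (an assumption for illustration): training inputs lie in [0, 1].
train_x = [i / 19 for i in range(20)]
train_y = [x ** 2 for x in train_x]

# Candidate pool: three in-distribution points and two far outside [0, 1].
pool = [0.2, 0.5, 0.8, 3.0, 5.0]

committee = train_committee(train_x, train_y)
selected = select_queries(committee, pool, k=2)
```

In this toy setup the committee members agree closely inside the training range and diverge as they extrapolate, so the points far from the training data are ranked first. In practice the selected samples would be labeled and added to the training set before retraining.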
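Keywords: PREDICTION, adding, CRITICAL
Classification: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]; TP3 [Automation and Computer Technology: Control Science and Engineering]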