Tracking materials science data lineage to manage millions of materials experiments and analyses  


Authors: Edwin Soedarmadji, Helge S. Stein, Santosh K. Suram, Dan Guevarra, John M. Gregoire

Affiliations: [1] Joint Center for Artificial Photosynthesis, California Institute of Technology, Pasadena, CA, USA; [2] Toyota Research Institute, Los Altos, CA 94022, USA

Source: npj Computational Materials (计算材料学(英文)), 2019, Issue 1, pp. 460-468, 9 pages

Funding: This study and the acquisition of all data are based upon work performed by the Joint Center for Artificial Photosynthesis, a DOE Energy Innovation Hub, supported through the Office of Science of the US Department of Energy (Award No. DE-SC0004993). Use of the Stanford Synchrotron Radiation Lightsource, SLAC National Accelerator Laboratory, is supported by the US Department of Energy, Office of Science, Office of Basic Energy Sciences under Contract No. DE-AC02-76SF00515.

Abstract: In an era of rapid advancement of algorithms that extract knowledge from data, data and metadata management are increasingly critical to research success. In materials science, there are few examples of experimental databases that contain many different types of information, and compared with other disciplines, the database sizes are relatively small. Underlying these issues are the challenges in managing and linking data across disparate synthesis and characterization experiments, which we address with the development of a lightweight data management framework that is generally applicable for experimental science and beyond. Five years of managing experiments with this system has yielded the Materials Experiment and Analysis Database (MEAD), which contains raw data and metadata from millions of materials synthesis and characterization experiments, as well as the analysis and distillation of that data into property and performance metrics via software in an accompanying open source repository. The unprecedented quantity and diversity of experimental data are searchable by experiment and analysis attributes generated by both researchers and data processing software. The search web interface allows users to visualize their search results and download zipped packages of data with full annotations of their lineage. The enormity of the data provides substantial challenges and opportunities for incorporating data science in the physical sciences, and MEAD’s data and algorithm management framework will foster increased incorporation of automation and autonomous discovery in materials and chemistry research.
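As an illustration of the lineage idea in the abstract, the following is a minimal sketch of how records from synthesis, characterization, and analysis could be linked so that any derived result can be traced back to its sources and searched by attribute. It is not MEAD's actual schema or software: the `Record` class, the `search` and `lineage` helpers, and all identifiers and attribute values are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """One node in a lineage graph (hypothetical schema, not MEAD's)."""
    record_id: str
    kind: str                      # e.g. "synthesis", "experiment", "analysis"
    attributes: dict = field(default_factory=dict)
    parents: tuple = ()            # ids of the records this one was derived from


def search(store, **criteria):
    """Return sorted ids of records whose attributes match every criterion."""
    return sorted(
        rid for rid, rec in store.items()
        if all(rec.attributes.get(k) == v for k, v in criteria.items())
    )


def lineage(store, record_id):
    """Return ids of all ancestors of record_id, nearest first (breadth-first)."""
    seen, order = set(), []
    frontier = list(store[record_id].parents)
    while frontier:
        current = frontier.pop(0)
        if current in seen:
            continue               # a shared ancestor is reported only once
        seen.add(current)
        order.append(current)
        frontier.extend(store[current].parents)
    return order


# Illustrative data only: one synthesis, two experiments on it, one analysis.
records = [
    Record("syn-1", "synthesis", {"technique": "inkjet"}),
    Record("exp-1", "experiment", {"type": "xrd"}, parents=("syn-1",)),
    Record("exp-2", "experiment", {"type": "uvvis"}, parents=("syn-1",)),
    Record("ana-1", "analysis", {"metric": "band_gap"}, parents=("exp-2",)),
]
store = {rec.record_id: rec for rec in records}
```

In this sketch, `lineage(store, "ana-1")` walks from the analysis back through the experiment to the synthesis record, and `search(store, type="xrd")` selects records by an attribute. Storing only parent ids on each record keeps the structure lightweight while still allowing a downloaded result to carry its full ancestry.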

Keywords: LINKING; DISTILLATION; AUTONOMOUS

Classification code: TP3 [Automation and Computer Technology: Computer Science and Technology]

 
