Introducing MagBERT: A language model for magnesium textual data mining and analysis


Authors: Surjeet Kumar, Russlan Jaafreh, Nirpendra Singh, Kotiba Hamad, Dae Ho Yoon

Affiliations: [1] School of Advanced Materials Science & Engineering, Sungkyunkwan University, Suwon 16419, South Korea; [2] Department of Physics, Khalifa University of Science and Technology, Abu Dhabi 127788, UAE

Source: Journal of Magnesium and Alloys (镁合金学报, English edition), 2024, Issue 8, pp. 3216-3228 (13 pages)

Funding: Supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. RS-2023-00221186).

Abstract: Magnesium (Mg)-based materials hold immense potential for various applications due to their low weight and high strength-to-weight ratio. However, to fully harness the potential of Mg alloys, structured analytics are essential to gain valuable insights from centuries of accumulated knowledge. Efficient information extraction from the vast corpus of scientific literature is crucial for this purpose. In this work, we introduce MagBERT, a BERT-based language model specifically trained for Mg-based materials. Trained on a dataset of approximately 370,000 abstracts focused on Mg and its alloys, MagBERT is designed to understand the intricate details and specialized terminology of this domain. Through rigorous evaluation, we demonstrate the effectiveness of MagBERT for information extraction using a fine-tuned named entity recognition (NER) model, named MagNER. This NER model can extract mechanical, microstructural, and processing properties related to Mg alloys. For instance, we have created an Mg alloy dataset that includes properties such as ductility, yield strength, and ultimate tensile strength (UTS), along with standard alloy names. MagBERT is a novel advancement in the development of Mg-specific language models, marking a significant milestone in the discovery of Mg alloys and in textual information extraction. By making the pre-trained weights of MagBERT publicly accessible, we aim to accelerate research and innovation in the field of Mg-based materials through efficient information extraction and knowledge discovery.
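As a rough illustration of the extraction step the abstract describes, the sketch below shows how per-token NER predictions are commonly grouped into entities. It assumes a BIO tagging scheme and hypothetical label names (`ALLOY`, `PROPERTY`, `VALUE`); the abstract does not state MagNER's actual label set or decoding procedure, and the example sentence is invented for illustration, not taken from the paper's dataset.

```python
from typing import List, Optional, Tuple


def decode_bio(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_text, label) pairs.

    Assumption: tags follow the common BIO scheme ("B-X", "I-X", "O").
    An "I-X" tag that does not continue an open "X" entity is treated as "O".
    """
    entities: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_label: Optional[str] = None

    def flush() -> None:
        # Close the entity being built, if any.
        nonlocal current_tokens, current_label
        if current_tokens and current_label is not None:
            entities.append((" ".join(current_tokens), current_label))
        current_tokens = []
        current_label = None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            current_tokens = [token]
            current_label = tag[2:]
        elif tag.startswith("I-") and current_label == tag[2:]:
            current_tokens.append(token)
        else:
            flush()
    flush()
    return entities


# Hypothetical sentence and hypothetical labels, for illustration only.
tokens = ["The", "AZ31", "alloy", "showed", "a", "yield", "strength",
          "of", "200", "MPa"]
tags = ["O", "B-ALLOY", "O", "O", "O", "B-PROPERTY", "I-PROPERTY",
        "O", "B-VALUE", "I-VALUE"]

for text, label in decode_bio(tokens, tags):
    print(f"{label}: {text}")
```

In a pipeline of this kind, the tags would come from a token-classification model fine-tuned on annotated abstracts, and the decoded pairs would then be collected into a structured alloy-property table.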

Keywords: Mg alloys; MagBERT; BERT; NLP; Text mining; Information extraction

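Classification: TP311.13 [Automation and Computer Technology: Computer Software and Theory]; TG146.22 [Automation and Computer Technology: Computer Science and Technology]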

 
