Affiliations: [1] Hubei Key Laboratory of Intelligent Vision Based Monitoring for Hydroelectric Engineering, China Three Gorges University, Yichang 443002, China; [2] College of Computer and Information Technology, China Three Gorges University, Yichang 443002, China; [3] Hubei Engineering Technology Research Center for Farmland Environment Monitoring, China Three Gorges University, Yichang 443002, China; [4] School of Computer Science, China University of Geosciences, Wuhan 430074, China; [5] Key Laboratory of Urban Land Resources Monitoring and Simulation, Ministry of Natural Resources, Shenzhen 518034, China; [6] Beijing Key Laboratory of Urban Spatial Information Engineering, Beijing 100045, China; [7] College of Economics and Management, China Three Gorges University, Yichang 443002, China
Source: Journal of Earth Science, 2023, Issue 5, pp. 1390-1405 (16 pages). Chinese title: 地球科学学刊(英文版)
Funding: Financially supported by the National Key R&D Program of China (No. 2022YFF0711601); the Natural Science Foundation of Hubei Province of China (No. 2022CFB640); the Open Fund of Key Laboratory of Urban Land Resources Monitoring and Simulation, Ministry of Natural Resources (No. KF-2022-07-014); the Opening Fund of Hubei Key Laboratory of Intelligent Vision-Based Monitoring for Hydroelectric Engineering (No. 2022SDSJ04); and the Beijing Key Laboratory of Urban Spatial Information Engineering (No. 20220108).
Abstract: Geological knowledge can support knowledge discovery, knowledge inference, and mineralization prediction from geological big data. Entity identification and relationship extraction from geological description text are the key steps in constructing knowledge graphs. Given the lack of publicly annotated datasets in the geology domain, this paper describes the construction of a geological entity dataset, defines the entity types and inter-concept relationships using the geological entity concept system, and completes the construction of a geological corpus. Existing word-embedding models (such as Word2vec and GloVe) cannot handle polysemous words and fuse context poorly. To address these shortcomings, we propose a joint geological named entity recognition and relationship extraction model built on the pretrained Bidirectional Encoder Representations from Transformers (BERT) language model. To represent text features effectively, we construct a BERT-bidirectional gated recurrent unit (BiGRU)-conditional random field (CRF) architecture to extract named entities and a BERT-BiGRU-Attention architecture to extract entity relations. The results show that the BERT-BiGRU-CRF named entity recognition model achieves an F1-score of 0.91 and the BERT-BiGRU-Attention relationship extraction model achieves an F1-score of 0.84, both significant improvements over classic language models (e.g., Word2vec and Embeddings from Language Models (ELMo)).
Keywords: ontology; BERT model; named entity recognition; relation extraction; knowledge graph
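Note: the abstract reports F1-scores for named entity recognition without stating how they are computed. The sketch below shows one common convention, entity-level F1 over BIO-tagged sequences, where a predicted entity counts as correct only if its type and boundaries both match the gold annotation. The BIO scheme, the function names, and the ROCK and MINERAL labels are assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Set, Tuple

# (entity type, start index, end index exclusive)
Span = Tuple[str, int, int]


def bio_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO tag sequence.

    An I- tag that does not continue an open span of the same type
    is ignored (a simplifying assumption of this sketch).
    """
    spans: Set[Span] = set()
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        continues = tag.startswith("I-") and etype == tag[2:]
        if continues:
            continue
        if etype is not None:
            spans.add((etype, start, i))
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
        else:
            start, etype = None, None
    return spans


def entity_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Micro-averaged entity-level F1 over a list of sentences."""
    tp = n_gold = n_pred = 0
    for g, p in zip(gold, pred):
        gold_spans, pred_spans = bio_spans(g), bio_spans(p)
        tp += len(gold_spans & pred_spans)  # exact type and boundary match
        n_gold += len(gold_spans)
        n_pred += len(pred_spans)
    if tp == 0:
        return 0.0
    precision, recall = tp / n_pred, tp / n_gold
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: the paper's actual entity types are not listed here.
gold = [["B-ROCK", "I-ROCK", "O", "B-MINERAL"]]
pred = [["B-ROCK", "I-ROCK", "O", "O"]]
score = entity_f1(gold, pred)
```

In this toy case the prediction recovers one of the two gold entities and proposes no spurious ones, so precision is 1.0, recall is 0.5, and the F1-score is 2/3.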