Authors: Ming Lin, Meng Jin, Yufu Liu, Yuqi Bai
Source: International Journal of Digital Earth (国际数字地球学报, English edition), 2022, Issue 1, pp. 1290-1304, 15 pages in total
Funding: Supported by the National Key Research and Development Program of China [grant number 2019YFE0126400].
Abstract: Earth observations, especially satellite data, have produced a wealth of methods and results in meeting global challenges, often presented in unstructured texts such as papers or reports. Accurate extraction of satellite and instrument entities from these unstructured texts can help to link and reuse Earth observation resources. The direct use of an existing dictionary to extract satellite and instrument entities suffers from the problem of poor matching, which leads to low recall. In this study, we present a named entity recognition model to automatically extract satellite and instrument entities from unstructured texts. Due to the lack of manually labeled data, we apply distant supervision to automatically generate labeled training data. Accordingly, we fine-tune the pre-trained language model with early stopping and a weighted cross-entropy loss function. We propose the dictionary-based self-training method to correct the incomplete annotations caused by the distant supervision method. Experiments demonstrate that our method achieves significant improvements in both precision and recall compared to dictionary matching or standard adaptation of pre-trained language models.
Keywords: Earth observation; named entity recognition; pre-trained language model; distant supervision; dictionary-based self-training
Classification: P3 [Astronomy and Earth Sciences: Geophysics]
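
The abstract describes distant supervision as a way to generate labeled training data from an existing dictionary. The following is a minimal sketch of that general idea only, not the authors' implementation: the function name `distant_label`, the toy dictionary, the `SAT`/`INS` label names, and the whitespace tokenization are all assumptions made for illustration.

```python
from typing import Dict, List


def distant_label(tokens: List[str], dictionary: Dict[str, str]) -> List[str]:
    """Assign BIO tags by longest-match lookup of token spans in a dictionary.

    `dictionary` maps a lowercased, space-joined entity name to an entity
    type (the hypothetical labels "SAT" and "INS" are used below).
    """
    # Longest entity name in the dictionary, measured in tokens.
    max_len = max((len(name.split()) for name in dictionary), default=0)
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first so multi-word names win.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = " ".join(tokens[i:i + n]).lower()
            if span in dictionary:
                label = dictionary[span]
                tags[i] = f"B-{label}"
                for j in range(i + 1, i + n):
                    tags[j] = f"I-{label}"
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return tags


# Toy dictionary (assumed for illustration; not taken from the paper).
toy_dictionary = {"landsat 8": "SAT", "terra": "SAT", "modis": "INS"}
sentence = "We used MODIS onboard Terra and Landsat 8 imagery".split()
for token, tag in zip(sentence, distant_label(sentence, toy_dictionary)):
    print(token, tag)
```

Any entity missing from the dictionary is tagged `O` in this sketch, which illustrates the kind of incomplete annotation the abstract says the dictionary-based self-training step is meant to correct.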