Research on Knowledge Extraction Method for Agricultural Science and Technology Policies Based on Deep Learning


Authors: ZHAO Xiaodan (赵小丹); HU Lin (胡林) [1]

Affiliation: [1] Agricultural Information Institute of CAAS (Chinese Academy of Agricultural Sciences), Beijing 100081, China

Source: Frontiers of Data & Computing (《数据与计算发展前沿(中英文)》), 2024, No. 4, pp. 106-115 (10 pages)

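Funding: National Key Research and Development Program of China, "Application Demonstration for Convergence Science Scenarios" (2021YFF0704204).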

Abstract: [Application Background] Agricultural science and technology policies strongly influence scientific progress and the development of agricultural production, and policies issued by different government departments are related to one another through conceptual entities. [Objective] Named entity recognition and relation extraction for agricultural science and technology policies rely heavily on manually designed features, which is time-consuming and labor-intensive. To address this, the study proposes a knowledge extraction method based on the BERT-BiLSTM-CRF model. [Method] A new annotation scheme tailored to the domain corpus models triples directly, replacing traditional joint extraction or separate modeling, and thereby turns entity and relation recognition into a sequence labeling problem. The experiments use policy text totaling 19,779 sentences and 376,721 characters, and recognize 8 entity types (such as policy and industry) and 10 relation types (such as citation and publication). [Results] On this corpus, the BERT-BiLSTM-CRF model achieves a precision of 81.61%, a recall of 85.34%, and an F1 score of 83.47%. The results indicate that the method effectively extracts entities and relations from agricultural science and technology policies and outperforms the other classical models it was compared with.
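The abstract states that triples are modeled directly as a sequence labeling task but does not give the tag format. As a minimal sketch only, the Python below assumes a hypothetical tag layout `<B|I>-<RELATION>-<1|2>` (role 1 for the head entity, role 2 for the tail entity, `O` for everything else) and shows how such a tag sequence could be decoded back into triples. The function names, the relation label `PUBLISH`, the nearest-tail pairing rule, and the English word tokens are all illustrative assumptions; the paper's corpus is character-level Chinese text and its actual scheme may differ.

```python
from typing import List, Tuple

Span = Tuple[str, str, str]    # (text, relation, role)
Triple = Tuple[str, str, str]  # (head text, relation, tail text)


def extract_spans(tokens: List[str], tags: List[str]) -> List[Span]:
    """Group consecutive tokens that share a relation and role into spans.

    Assumed (hypothetical) tag format: "O" or "<B|I>-<RELATION>-<1|2>".
    """
    spans: List[Span] = []
    words: List[str] = []
    key = None  # (relation, role) of the span currently being built
    for token, tag in zip(tokens, tags):
        if tag == "O":
            new_key, starts_new = None, False
        else:
            position, relation, role = tag.split("-")
            new_key, starts_new = (relation, role), position == "B"
        # Close the open span when the label changes or a new "B" tag starts.
        if words and (new_key != key or starts_new):
            spans.append((" ".join(words), key[0], key[1]))
            words = []
        if new_key is not None:
            words.append(token)
        key = new_key
    if words:
        spans.append((" ".join(words), key[0], key[1]))
    return spans


def spans_to_triples(spans: List[Span]) -> List[Triple]:
    """Pair each head span with the nearest unused tail span of the same relation."""
    triples: List[Triple] = []
    used = set()
    for i, (head, relation, role) in enumerate(spans):
        if role != "1":
            continue
        candidates = [
            j for j, (_, rel, ro) in enumerate(spans)
            if rel == relation and ro == "2" and j not in used
        ]
        if not candidates:
            continue  # a head without a matching tail yields no triple
        j = min(candidates, key=lambda k: abs(k - i))
        used.add(j)
        triples.append((head, relation, spans[j][0]))
    return triples


# Made-up example sentence, not taken from the paper's corpus.
tokens = ["Ministry", "A", "issued", "Plan", "B"]
tags = ["B-PUBLISH-1", "I-PUBLISH-1", "O", "B-PUBLISH-2", "I-PUBLISH-2"]
triples = spans_to_triples(extract_spans(tokens, tags))
```

Under this assumed layout, a single tagger such as BERT-BiLSTM-CRF only has to predict one label per token, and the relation and both entities are recovered in a post-processing step like the one above.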

Keywords: agricultural science and technology policy; BERT-BiLSTM-CRF; knowledge extraction; entity recognition

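Classification codes: S126 [Agricultural Sciences: Basic Agricultural Sciences]; TP391.1 [Automation and Computer Technology: Computer Application Technology]; TP18 [Automation and Computer Technology: Computer Science and Technology]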

 
