Joint Extraction Model of Multi-entity Relations for Poultry Diagnosis and Treatment Text

Cited by: 6

Authors: HU Bin [1], TANG Baohu, JIANG Haiyan [1,2], HUO Ao, HAN Wenxiao

Affiliations: [1] College of Artificial Intelligence, Nanjing Agricultural University, Nanjing 210095, China; [2] National Engineering and Technology Center for Information Agriculture, Nanjing Agricultural University, Nanjing 210095, China

Source: Transactions of the Chinese Society for Agricultural Machinery (农业机械学报), 2021, No. 6, pp. 268-276 (9 pages)

Funding: National Key Research and Development Program of China (2016YFD0300607).

Abstract: Traditional entity relation extraction methods struggle to fuse subject features effectively with sentence vectors, and the existing BIO tagging strategy handles overlapping relations poorly. To address these problems, a joint entity relation extraction model for poultry disease diagnosis and treatment text (JEER_PD), based on BERT and dual-pointer labeling, is proposed. JEER_PD uses a dual-pointer labeling (DPL) strategy that sets up two pointer taggers, one for heads and one for tails, to mark the start and end positions of all entities in a single pass. It introduces a conditional layer normalization (CLN) layer to strengthen the link between the subject extraction task and the joint object-relation extraction task, and it applies a probability balance strategy (PBS) to counter the imbalance between positive and negative labels and thereby speed up model convergence. Experiments show that JEER_PD achieves a precision, recall and F1 of 97.69%, 97.59% and 97.64%, respectively, with all three metrics significantly higher than those of existing methods, indicating that JEER_PD can quickly and accurately extract entity relation triples from complex poultry disease diagnosis and treatment texts.
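As a rough illustration of the dual-pointer labeling idea mentioned in the abstract, the sketch below encodes entity spans as two 0/1 sequences (one marking start positions, one marking end positions) and decodes them back. It is not the authors' implementation: the function names `encode_spans` and `decode_spans`, the 0.5 threshold, the nearest-tail pairing rule, and the example sentence are all assumptions made for illustration.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token indices, both inclusive


def encode_spans(length: int, spans: List[Span]) -> Tuple[List[int], List[int]]:
    """Build head and tail pointer label sequences for `length` tokens."""
    heads = [0] * length
    tails = [0] * length
    for start, end in spans:
        heads[start] = 1  # head pointer marks where an entity begins
        tails[end] = 1    # tail pointer marks where an entity ends
    return heads, tails


def decode_spans(heads: List[float], tails: List[float],
                 threshold: float = 0.5) -> List[Span]:
    """Pair each predicted head with the nearest tail at or after it.

    The nearest-tail rule is an assumed heuristic, not taken from the paper.
    """
    spans: List[Span] = []
    for i, head_score in enumerate(heads):
        if head_score < threshold:
            continue
        for j in range(i, len(tails)):
            if tails[j] >= threshold:
                spans.append((i, j))
                break
    return spans


# Hypothetical sentence with three entities, for illustration only.
tokens = ["Newcastle", "disease", "causes", "respiratory", "distress", "in", "chickens"]
gold = [(0, 1), (3, 4), (6, 6)]
heads, tails = encode_spans(len(tokens), gold)
recovered = decode_spans(heads, tails)
entities = [" ".join(tokens[s:e + 1]) for s, e in recovered]
```

Because every start and end position gets its own binary label, all entities in a sentence can be marked in one pass. In models of this kind a separate pair of taggers is typically used per relation type, which is what lets one token take part in several triples, something a single BIO tag per token cannot express.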

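Keywords: poultry disease diagnosis and treatment text; entity relation extraction; overlapping relations; BERT language model; dual-pointer labeling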

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
