Application of Subject Enhanced Cascade Binary Pointer Tagging Framework in Chinese Medical Entity and Relation Extraction


Authors: JIANG Zhihan (姜植瀚); ZAN Hongying (昝红英) [2]; ZHANG Li (张莉)

Affiliations: [1] College of Software, Jilin University, Changchun 130012, China; [2] School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450001, China; [3] College of Life Science, Jilin University, Changchun 130012, China

Source: Computer Science (《计算机科学》), 2024, Issue S01, pp. 97-102 (6 pages)

Abstract: With the rapid development of medicine in China, the volume of Chinese medical text is growing quickly. Extracting valuable information from these texts can ease the learning curve for practitioners. To address entity and relation extraction in the Chinese medical domain, a series of models based on bidirectional LSTM have previously been proposed. However, because of issues such as the training speed of bidirectional LSTM, this paper introduces the cascade binary pointer tagging framework to entity and relation extraction from Chinese medical text. To compensate for the framework's weak subject recognition and to address the gradient problems that arise when the encoding layer is reused, the paper proposes a subject enhancement module and adopts conditional layer normalization, yielding the Subject Enhanced Cascade Binary Pointer Tagging Framework for Chinese Medical Text (SE-CAS). The subject enhancement module identifies the valid subjects among those detected by the subject recognition module and excludes erroneously identified entities. Conditional layer normalization replaces the simple addition of word embeddings and subject embeddings used in the original model, and is applied to the encoding layer and the subject encoding layer. Experimental results show that the proposed model improves the F1 score by 5.73% on the CMeIE dataset. Ablation experiments confirm that each module contributes a performance gain and that these gains are cumulative.
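The abstract says that conditional layer normalization replaces the simple addition of word embeddings and subject embeddings, but it does not give the formula. The following is a minimal sketch of one common formulation, in which the scale and shift of layer normalization are linear functions of the condition vector. The function names, the projection matrices `W_gamma` and `W_beta`, and the `1 + ...` parameterization of the scale are assumptions for illustration, not details confirmed by the paper.

```python
import math


def matvec(W, c):
    """Multiply matrix W (list of rows) by vector c."""
    return [sum(w * ci for w, ci in zip(row, c)) for row in W]


def conditional_layer_norm(x, cond, W_gamma, W_beta, eps=1e-12):
    """Layer-normalize token vector x, conditioned on vector cond.

    Assumed formulation (not taken from the paper):
        gamma = 1 + W_gamma @ cond
        beta  = W_beta @ cond
        out   = gamma * (x - mean) / sqrt(var + eps) + beta
    Here cond would play the role of the subject embedding and x the
    encoder output for one token.
    """
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    gamma = [1.0 + d for d in matvec(W_gamma, cond)]
    beta = matvec(W_beta, cond)
    return [
        g * (v - mean) / math.sqrt(var + eps) + b
        for v, g, b in zip(x, gamma, beta)
    ]


# Hypothetical toy values: a 4-dimensional token vector and a
# 2-dimensional condition (subject) vector.
x = [1.0, 2.0, 3.0, 4.0]
cond = [0.5, -0.5]
zeros = [[0.0, 0.0] for _ in range(4)]

# With zero projections the condition has no effect, so the result is
# plain layer normalization.
plain = conditional_layer_norm(x, cond, zeros, zeros)

# A non-zero shift projection moves every feature by W_beta @ cond.
W_beta = [[1.0, 0.0] for _ in range(4)]
shifted = conditional_layer_norm(x, cond, zeros, W_beta)
```

In this sketch, initializing both projections to zero makes the layer start out as ordinary layer normalization, so the condition is introduced gradually during training rather than being added directly to the token representation.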

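Keywords: entity and relation extraction; cascade binary pointer network; medical relation extraction; deep learning; subject recognition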

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
